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ABSTRACT 


The problem of wavefront reconstruction is important in high precision optical systems, 
such as astronomical telescopes, where it is used to estimate the distortion of the collected 
light caused by the atmosphere and corrected by adaptive optics. A generalized orthogo- 
nal wavelet wavefront reconstruction algorithm is presented in this research for use with 
gradient measurements from a Shack-Hartmann wavefront sensor. This algorithm can be 
implemented using a number of different wavelets for improved performance in the pres- 
ence of noise. An extension of this algorithm is also presented to provide wavefront es- 
timation in the presence of isolated branch points where the phase is undetermined. The 
wavefront is obtained by augmenting the wrapped observations with a filtered curl of the 
vector field. The wavefront estimation can then be used for surface control of a deformable 
mirror. A third contribution is in deformable mirror surface control. The control signals 
to a deformable mirror are computed that minimize the wavefront error using constrained 
optimization to ensure that the hardware actuator voltage limits are satisfied. A sequence 
of optimal solutions is used to verify the linear model of a deformable mirror. A multigrid 


approach to the optimization problem is shown to improve computation efficiency. 
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Executive Summary 





Optical systems, such as astronomical telescopes and laser weapons, are highly affected by 
perturbation from the propagating medium. The key problem addressed in this dissertation 
is how to estimate this perturbation of the optical waveform and compensate for its effects. 
In addition, optical wavefront reconstruction is used in many modern applications of engi- 
neering and science to provide insight into the performance of optical systems that can be 


used for correction and improvement. 


Wavefront reconstruction can estimate the atmospheric distortion on the propagating wave 
of light. Adaptive optics (AO) uses information about the wavefront to correct the distor- 
tion and results in improved system performance. This technique can also detect optical 
manufacturing defects, thermal fluctuations of components, gravity sag of the components, 
and optical alignment of the components that also affect the wave. These defects are cor- 
rected with active optics. While AO corrections are usually small relative to the wavelength 


of light and rapidly changing, active optics corrections are large and slowly change. 


Both AO and active optics systems commonly use a deformable mirror (DM) as the device 
to apply the correction. A DM has a reflective surface with actuators along the back struc- 
ture that apply forces causing the mirror surface to adapt to a desired shape. A control law 
determines the individual voltages for each actuator that cause the mirror to take the desired 
shape. The common practice is that the mirror forms the conjugate of the wavefront, then 


the light reflecting off the mirror is a planar wavefront. 


In this research we address the problems of phase reconstruction and control of the DM 
to compensate for phase distortions. The results are a number of algorithms which are 
efficient and scalable to systems with a larger number of sensor measurements and also 


robust in the presence of phase singularities due to high levels of atmospheric turbulence. 


To address the research objectives, this dissertation has three contributions in AO. The first 
is wavefront reconstruction for a large number of sensor measurements. An algorithm for 
wavefront reconstruction is presented that uses orthogonal wavelet filters in a tree structure 
to estimate the relative phase of the wavefront from gradient measurements. The tree struc- 


ture is implemented with two-dimensional quadrature mirror filters (QMFs), which yields 


XVii 


a computationally efficient approach. The measurements contain noise from the sensor and 
it is desirable for the algorithm to mitigate the impact of noise on the resulting wavefront. 
The noise filtering properties of this algorithm depend on the chosen wavelet filters used 
in the QMF. A modification for the Haar wavelet filters is presented that removes all de- 
pendencies on the boundary. The algorithm is designed to reconstruct a square wavefront 
of size 2" x 2" for integer N. Although there are applications of AO systems with square 
apertures, an additional modification is proposed to handle realistic apertures with other 
shapes. The wavefront reconstruction algorithm is tested using simulated wavefront sensor 
data. 


This algorithm has been designed for irrotational vector field gradient measurements, where 
there is no phase ambiguity and the phase is well defined at every point. This is, in gen- 
eral, the case of distortion generated by low atmospheric turbulence. Under more severe 
turbulence conditions, the intensity of the optical field might be zero at isolated points, thus 
causing phase uncertainties. In this case, the measured phase gradient becomes rotational 
and it is characterized by phase uncertainty, and branch points, which cause problems in all 


standard least-squares algorithms. 


The second contribution is in wavefront reconstruction in the presence of significant atmo- 
spheric turbulence causing degradation of the optical beam. In this case, the measurements 
contain a rotational vector field component. An approach using a non-orthogonal decom- 
position to modify the rotational vector field to be irrotational is presented. The wavefront 
reconstruction algorithm operates on the irrotational measurements and estimates the phase 
that is consistent with the original measurements with rotational components. Our results 
show the wrapped phase measurements to make a comparison between the simulated phase 
and reconstructed wavefront phase. The algorithm is tested with simulated measurements 
of wavefronts. This approach can be applied to any wavefront reconstruction algorithm as 


the measurements are modified before the algorithm. 


A third contribution is in DM surface control. The commands are calculated as the solu- 
tion of a constrained optimization problem. The optimizer uses a linear influence function 
model for each actuator and determines a set of voltages that matches a desired surface 
shape. Although the model is linear, we expect nonlinear performance on laboratory hard- 


ware. An experiment performing mirror surface control using optimization is presented 


XVili 


where a sequence of optimal problems are used to verify the linear model of a DM. The 
hardware configuration uses a sensor which has many more measurement values than ac- 
tuators. To decrease computation time, a multigrid approach to the optimization problem 
is used, which results in the same optimal solution and is 2.5 times faster. This research 


verified the control approach using mathematical optimization on laboratory hardware. 


The research in this dissertation can be applied to a variety of AO systems. Wavefront 
reconstruction and mirror surface optimization are relevant to many military applications 
of precision optical systems. This research supports military capabilities in information 


dominance and directed energy weapon development. 
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CHAPTER 1: 


Introduction 





Precision optical systems, such as high-resolution imaging or laser weapons, are highly 
affected by perturbation from the propagating medium. The key problem addressed in this 
dissertation is how to estimate this perturbation of the optical waveform and compensate 
for its effects. Wavefront reconstruction is used in many modern applications of engineer- 
ing and science to provide insight into the performance of optical systems that can be used 
for correction and improvement. Wavefront reconstruction is particularly important in as- 
tronomy to estimate the distortion of the collected light caused by the atmosphere [1], [2]. 
In modern telescopes, this information is used to correct for the distortion using adaptive 
optics (AO) to reduce the distortions detected by wavefront reconstruction and gather high- 
quality observations of celestial objects. The AO corrections occur on a small time scale 
with updates faster than ten times a second. Wavefront reconstruction can also sense distor- 
tions from a variety of other sources, such as manufacturing defects, thermal fluctuations, 
gravity sag, and optical alignment. These corrections are performed by active optics [3] 


and occur on a longer time scale with corrections only once a second or slower. 


An AO or active optics system contains three common components: a wavefront sensor, 
a computer, and a deformable mirror (DM) [1]. The computer uses measurements from 
the sensor to perform the wavefront reconstruction and then commands the DM actuators. 
The actuators cause forces along the back of the mirror structure and the mirror surface 
deflects to form the conjugate shape of the wavefront. The incoming light reflects off the 
DM surface and the wavefront becomes planar. The planar wavefronts improve the optical 
performance of the system, and the collected images have improved angular resolution to 
distinguish fine details on the distant object. The AO computer must estimate the wavefront 
in a short time period and calculate the actuator commands for the DM. If either procedure 
takes too long, the distortion may change, and the intended correction may reduce optical 


performance. 


Another application of wavefront reconstruction is in laser weapon systems, which require 


AO for long horizontal distances to targets. As the laser beam propagates from the telescope 


through the atmosphere to the target, the beam quality degrades from atmospheric effects 
and the weapon loses effectiveness. AO is used to apply a correction before the beam is 
emitted to reach the target and retain the high-energy laser performance. 


Large telescopes are placed high on mountains that have excellent atmospheric “seeing” 
conditions. The Fried parameter (or Fried coherence length) quantifies the quality of the 
“seeing” and the best locations on Earth may have a Fried parameter value of 20 to 40 cen- 
timeters. If the telescope aperture diameter is larger then the Fried parameter, the telescope 
is limited by the atmospheric turbulence; incorporating AO into the telescope can overcome 
the limitation imposed by the atmosphere. Science missions require larger primary mirrors 
to increase the angular resolution and gather more light from the science objects of distant 
stars and their orbiting planets. Large monolithic mirrors are limited in maximum surface 
dimensions and may take years to correctly manufacture, polish, and test [4], [5]. This lim- 
itation has led to segmented mirror telescopes, where many smaller mirrors are combined 
to fill the large aperture [3]. These technologies are used in existing telescopes such as the 
Gran Telescopio CANARIAS (GTC) and the W. M. Keck observatory, as shown in Fig- 
ure 1.1. A thorough historical perspective on segmented mirrors can be found in [6]. These 
telescopes are able to observe faint science objects using the highest fidelity in large-scale 


optics, active control, and sensor technology. 





Figure 1.1. Left: The GTC primary mirror located in La Palma, Spain, is displayed (courtesy 
of GTC Digital, from [7]). Right: This is overhead view of the W. M. Keck telescopes at 
the observatory on Mauna Kea, Hawaii, is displayed (courtesy of NASA JPL and W. M. Keck 
Observatory, from [8]). 


Future telescope designs incorporate technology advancements to expand the state-of-the- 


art science capabilities in celestial formations and fundamental physics. The Thirty Meter 





Figure 1.2. Left: Artist rendition of the TMT is displayed (courtesy of TMT Observatory 
Corporation, from [9]). Right: Artist rendition of the GMTO is displayed (courtesy of GMT 
Organization, from [10]). 


Telescope (TMT) and Giant Magellan Telescope (GMT) will each be larger than 25 me- 
ters in diameter [9], [10]. Current artistic renderings of these observatories are shown in 
Figure 1.2. The designs require a combination of optics, structures, and controls to make a 


functional telescope. 


Space-based telescope designs now incorporate segmented mirror technology. For exam- 
ple, the James Webb Space Telescope (JWST) consists of 18 mirror segments to form a 6.5 
meter primary mirror [11]. A conceptual rendering of the JWST is shown in Figure 1.3. 
Current rocket launch vehicles and fairings can support a stowed telescope configuration 
that deploys the primary mirror on orbit. After deployment, the mirror segments are posi- 


tioned precisely to form the primary mirror. 


In addition to segmented mirror technology, space-based telescopes incorporate active op- 
tics for on-orbit conditions that cause mirror deformations. Active optics are desired by 
program managers for risk mitigation of manufacturing processes that result in an unno- 
ticed defect until operation. For these reasons, active optics are highly desirable for space- 


based telescopes, as they balance performance risk and system cost. 


To mature the segmented mirror with active optics technology for spacecraft, the Segmented 
Mirror Telescope (SMT) was built for the Segmented Mirror Demonstrator (SMD) pro- 
gram. Its lightweight, segmented primary mirror was constructed by sophisticated manu- 


facturing processes to meet the size, weight and power (SWAP) requirements imposed on 





a 


Figure 1.3. Left: The NPS laboratory of the Segmented Mirror Telescope testbed is shown. 
Right: Artist rendition of the James Webb Space Telescope is displayed (courtesy of NASA, 
from [13]). 


satellites [12]. The mirrors were not of high enough quality for an operational mission. 
The manufacturing processes have matured since the SMD program finished and industrial 
methods are capable of producing excellent mirror surfaces. The SMT is now located at 
the Naval Postgraduate School (NPS) for research and shown in Figure 1.3. 


Another important application in AO and segmented mirrors with active optics for the 
Department of Defense (DOD) is rapidly fielded imaging systems in support of military op- 
erations. Segmented mirrors with active optics technology can field an imaging spacecraft 
much faster than conventional imaging spacecraft by decreasing the payload manufactur- 
ing schedule. An individual segment can be produced in a few months time and multiple 
segments can be produced concurrently. Many mirrors can be manufactured and the mir- 
rors with the best optical surfaces can be determined by wavefront reconstruction. Active 
optics reduce risk on the primary mirror fabrication by reducing polishing rework since the 
system can compensate for manufacturing defects. The most compelling justification for 


this technology is the cost savings from reduced schedule dedicated to the primary mirror. 


1.1 Research Objectives 

The first objective of this dissertation research is to develop a wavefront reconstruction al- 
gorithm that can be applied to large telescopes. A generalized orthogonal wavelet phase 
reconstruction algorithm is described. This approach is appropriate for AO systems with 


larger aperture size and increasing the number of sensor measurements. The increase in 


number of measurements causes issues for the least-squares solvers which rely on matrix- 
vector multiplications. The approach presented in this dissertation is comparable with least- 
squares algorithms but has lower computational cost. The generalized orthogonal wavelet 
approach allows for filters that have improved noise-rejection properties compared to other 
existing algorithms. The phase reconstruction algorithm is tested using simulated wave- 


front sensor data. 


The second objective of this dissertation research is to address wavefront reconstruction for 
significant wavefront distortion. Although many applications of phase reconstruction are 
used for weak disturbances on the wavefront, there are applications where the wavefront 
may experience significant distortion. Strong distortion of the wavefront can occur in long, 
horizontal paths through the atmosphere, such as in the maritime environment. Least- 
squares algorithms degrade with measurements from the significant distortion such that the 
wavefront estimate does not represent a meaningful quantity. The work presented in this 
dissertation allows for a correct wavefront to be reconstructed. The modification is tested 


with simulated measurements of wavefronts. 


The final objective of the dissertation research is the control of a DM using wavefront 
sensor data and a mathematical optimization algorithm. The optimizer result selects the 
actuator commands to the DM that matches a desired mirror surface shape. The approach 
relies on a linear model of actuator influence and the validity of this control approach 
is investigated. Prior to this work, the technique has only been validated using computer 


simulation. This dissertation research verifies the control approach on laboratory hardware. 


1.2 Dissertation Organization 

In Chapter 2, the fundamentals of two-dimensional signal processing are presented. The 
signal representation notation and transforms that are used throughout the dissertation are 
explained. The wavelet phase reconstruction algorithm is presented in Chapter 3. The 
algorithm is an approach that recursively operates on gradient measurements to form an 
estimated wavefront. In Chapter 4, the branch point modification is presented for wave- 
front reconstruction algorithms. Mirror surface control using optimization is discussed in 
Chapter 5 and an experiment is conducted to verify the linear influence model of a DM. 


Concluding remarks and future research opportunities are given in Chapter 6. 
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CHAP TER 2: 
Two-Dimensional Signal Processing 





A signal conveys information about a process of interest and usually is represented as a 
function of an independent variable, such as f(t). Although in most applications the in- 
dependent variable f is a scalar indicating time, in a more general setting, it is a vector 
t = |t,f,...,t] indicating time, space, or a combination. This leads to the definition 
of multidimensional signals f(t,,t2,...,t,). In this chapter, we present the fundamentals 
of multirate and multiresolution signal processing as applied to multidimensional signals. 
These concepts will be the basis of the phase reconstruction algorithms presented in subse- 


quent chapters. 


Although most physical signals evolve in a continuous domain of time and/or space, they 
are usually processed numerically in the discrete (or sampled) domain using digital sig- 
nal processing (DSP). We denote this signal by f(t,,f2,...,t) in the continuous domain 
and f[m1,n2,...,nv] = f(mpAty,n2Ate,...,nyAty) in the discrete domain, where each nj; 
is an integer for i= 1,2,...,M. The choice of variable name and brackets distinguish 
the difference between continuous and discrete signals. The terms At), At,...,Aty de- 
note the sampling intervals in each respective dimension. Although general sampling of 
multidimensional signals is not a straightforward extension of one-dimensional (1D) sam- 
pling [14], in the case of simple sampling on a rectangular grid we can still use the well 
know sampling theorem. In this way, the sampling frequencies F; = 1/At; are chosen to be 


larger than twice the maximum frequency associated to the respective variables. 


Many important signals in imaging applications are represented as two-dimensional (2D) 
signals. In this dissertation, we are dealing with 2D signals that represent the phase of an 
optical field across a telescope aperture represented by two spatial coordinates, x and y. 
For simplicity, sometimes multidimensional signals are abbreviated with vector notation as 
fn] for n = [nj,n2]. In some places in this dissertation, a signal is referred to only by its 
symbolic letter name but no brackets for brevity. In these circumstances, a discrete signal 


is implied. 


By the well-known sampling theorem, there is no loss of information if the signal is sam- 


pled at a sufficiently fast rate F,. Recovering the original band-limited continuous signal 


from the discrete signal can be done using Whittaker-Shannon interpolation 


fiy)= y y f [mi ,nz|sine (=2*) sinc (2) 


nj =—%N2=—0o 


-( y ¥ fem] 8(e— mst) ) sin (5) sine ts 


nj=—°%n2=—e 


(2.1) 


where the «* operator represents continuous time convolution, 6(t),t2) = 6(t,)6(t2) is the 
2D Dirac delta function, and sinc (t) = sin(at)/(zt) that is defined for all t € R. 


In the rest of the chapter we introduce the concept of frequency representation of 2D sig- 
nals, followed by multiresolution decomposition. These signal processing concepts will 


play a major role in the phase reconstruction algorithms presented in this dissertation. 


2.1 Two-Dimensional Fourier Transform 
We define the continuous 2D Fourier Transform (FT) as 


F (ky, Ky) = FT { f(y) } = 


aes . me) 
=| / f(x, y)e PRY) dxdy Pen 


with k,, K, having the appropriate units to ensure that the quantities K,.x and k,y are dimen- 


sionless. 


Extension of the Fourier transform to the 2D sampled signal yields the 2D Discrete Shift 
Fourier Transform (DSFT) defined as 


F(@)= y y fm, ngje Form Fenn) (2.3) 


Nj =—%Ny=—°e 


The units of @, @2 are dimensionless expressed in radians or radians per sample, which 
has a 27 periodicity in both dimensions of the DSFT; the two frequency variables are 


usually constrained to the intervals 


—-A<KO<7 (2.4) 


for i= 1,2. All other values of @ are associated to aliased frequencies from the sampling 
process. If f[nj,n2| = f(n1Ax,n2Ay), then @; and @p are related to ky, Ky as 


@, = 27K,Ax, @) = 27kK,Ay. (2.5) 


2.2 Two-Dimensional z-Transform and Filtering 


In this dissertation, we numerically process discrete signals. In the 1D discrete time case, 
the z-transform is the tool which most conveniently describes linear shift-invariant (LSI) 


operations. It is defined as 


F(z) =2{flnj}= » f[njz", zE€ROC (2.6) 
where z is a complex variable and is related to the frequency domain. In general, for signals 
of infinite length, the Region of Convergence (ROC) has to be defined. However when f|7] 
has a finite interval of definition, then convergence is guaranteed for 0 < |z| < co. When 
z =e! and z € ROC, then (2.6) becomes the same as the DSFT. The inverse z-transform 
is given by 

1 
a eed =5— $F ld, 27 
fn] {F(z) ony (z)z" dz (2.7) 
In the above definition, C is a closed, counter-clockwise directed contour that encircles the 


origin and is completely inside of the ROC. 
The z-transform can be extended to the 2D case and its definition is given by 


F (21,22) = 2{f[m,n2]} = De 


nj=—©ny 


flmi,n2|z, "1%", Z1,Z2 € ROC. (2.8) 


co 


eek: 


One of the fundamental properties of the z-transform is that a shift in the domain of defi- 
nition (time or space, accordingly), corresponds to an algebraic operation in the z-domain, 
as 

Z{ fn +L1,n2 + Lo} = ASF (z1, 22) (2.9) 


where Lj, Lz are integers. By this respect, the variables zj, z2 can be viewed as shift 


operators. This leads to the use of z, z2 as unit shift operators, for which we indicate 
zy !z9"? f [ny n2] = flr +L1,02 + Ly]. (2.10) 


By this interpretation, we will make minimal use of the z-transform and rather use the shift 


operators. 


Filtering is performed by using the convolution operation. For the 2D case, the convolution 


of f and g is stated as 


co 000 


feayereesy) = ff fle.e)ge—cry—e)derder (2.11) 


—oo —oo 


for continuous signals and for the discrete signals is 


f[n1,n2] * *g[n1,n2] = y y f (m1, m2\g[ny — m1, n2 — my). (2.12) 
m,=—c M7 =—c 
Convolution expresses many physical processes and applications arise where one of the 
functions represents measured information and the other function represents a filtering ker- 
nel that modifies the measurements. In this dissertation, convolution is used to apply filters 
to the measured data of the AO system. 


In order to efficiently represent convolution, throughout this dissertation we make frequent 
use of operator notation. Using the definition of the shift operator z;, z2, and the fact that 


x|[ny — my,N2 — M2] = Zz, ""!z5"""x[n1,N2], we write the convolution of two signals as 


y(n ,n2| = y y hlmy,mg]z,"'z5"""x[n1,n2]- (2.13) 
M,{=—-PMyQ=—e 


Then by defining the transfer function of the LSI system 


co co 


Ha,a2)= YoY Alm, m)z"z,”™, (2.14) 


M,{j=—PMyQ=—e 


10 


we can write the convolution in operator form as 


ylm,n2| = A(z1,22)x[n1,n9]. (2.15) 


This representation is completely equivalent to the convolution in (2.12) and to its z- 
transform correspondent in (2.8). We view (2.15) as a short-hand notation of the convo- 
lution operator, which, when necessary, can be extended to the corresponding z-transform 
with previous definitions of the correct ROC. Equation (2.15) is equivalent to the following 
sequence of operations: 

X (21,22) = Z{x[n1, 72] } 

Y (21,22) = H(z1,22)X (Z1,22) (2.16) 

yy ,m2] = 2-{Y (Z1,22)}- 


For ease of notation, we skip the indices n,, nz and the variables z,, z2 when they are 


apparent from the context. In this way, the following expressions are all equivalent: 


y[ni,n2] = A(z1,22)x[n1,n2], 
y = A(21,22)x, (2.17) 
y= Hx. 


Most of the operations in this dissertation are based on a combination of filtering and “re- 


sampling” operations, leading to a multirate signals, described next. 


2.3 Fundamentals of Multirate DSP 


In this section, we review the major definitions and results of multirate DSP, as applied to 
general multidimensional signals. Multirate DSP will be the basis of the proposed phase 
estimation technique, which processes observed phase differences into a tree-like structure 


at different scales and then recombines them into the final phase estimation. 


In multirate DSP, there are two elementary resampling operations: downsampling and up- 
sampling. By downsampling, we eliminate samples, which is equivalent to resampling the 


signal at a lower rate. By upsampling, we interpolate between samples, which is equivalent 
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Figure 2.1. A signal is shown in (a) that is being upsampled in (b) and downsampled in (c). The 
index variable identifies the different sampling grids. Both examples use L = 2. 


n 











to resampling the signal at a higher rate. Whether we do this in the time or space do- 
main is immaterial. Although we define these operations in one dimension, they are easily 


extendable to two or more dimensions. 


Upsampling involves adding new samples between existing samples and all of the new 


samples are zero valued. The upsampling of the signal x|n] as y|n] is expressed as 


: (2.18) 
0 otherwise 


vin) = a if n/L€ Z, 


where L is the integer upsampling factor (two or larger). Adding zeros between data points 
is a fundamental signal processing tool to be used in conjunction with filtering. The length 
of y[n] is now L times the length of x|n]. An example of upsampling a signal in Figure 2.1 


(a) is shown in Figure 2.1 (b). 


As previously stated, downsampling is the reduction of samples and is defined as 
y|[n] = x[Ln] (2.19) 


where L is the integer downsampling factor (two or larger). Whereas upsampling retains 
all original samples, downsampling loses information. An example of downsampling is 


shown in Figure 2.1 (c). 


In this dissertation, the resampling is always performed using a factor of two on multidi- 


mensional signals, in which the operators are treated independently and operate only in 
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Figure 2.2. Noble identities in block diagram showing equivalence of (2.22) for both dimen- 
sions. The serial combination of two operations that both occur in the same dimension cannot 
change order without changing the filtering operation. The subscript on the downsampling arrow 
indicates the dimension and is then followed by the downsampling factor. 
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their dimension. To simplify the notation, the upsampling operator is defined as 


. [211 ,n2| = x|n1,n2] 
y=HUix = 
2 1 =0 
yi [2m + 1,19] (2.20) 
y2[n1, 2ng| = x|[n1,n2] 
y2 = Uxnx & 
y2|ny,2n2+ 1] =0 
for each dimension. The downsampling operator is similarly defined as 
yy =Dix = yi |mi,n2] = x[2n1,n9], 
y2 = Dox = y2[m1,n2] = x[m1, 2n2], (2.21) 


Both operations of upsampling and downsampling are always combined with filtering oper- 
ations, as shown in Figure 2.2. A full treatment is given in numerous well-known references 
(see [15], [16]), and not reproduced here for brevity. Just to give an idea, the reason we 
filter before downsampling in Figure 2.2 is that lowering the data rate causes aliasing, thus 
requiring an anti-aliasing filter. As far as upsampling, the interpolation of zeros is clearly 
unsatisfactory. However, the addition of a filter is equivalent to interpolating the signal by 
the impulse response of the filter. Again, it is always an advantage to represent all of these 
operations in terms of operators, which we can easily and efficiently manipulate. Operator 
notation is particularly important in the problem we have at hand, which, as we will see, is 
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based on resampling in two dimensions and can lead to a fair amount of complexity. 


The Noble identities [15], [16] are used in multirate signal processing to describe the equiv- 
alence of swapping the ordering of downsampling-filtering to filtering-downsampling. The 


Noble identities correctly establish the relationship in instances such as 


y, = DH (zi) x = H(z1)D1x, 


‘ (2.22) 
y2 = DoH (z5)x = H(z2)Dox. 


The equivalence of (2.22) in block diagram form is shown in Figure 2.2. For example, let 
H (zi) =z, ~~ then the left-hand expression of (2.22) is 


yt [mm] = Dy zy’ x[n] = Dy x[n — 2L] = x[2m — 2L] = x[2(m — L)] (2.23) 
and the right-hand expression is 
yi fm] = zy, "Dyx[n] = z7*x[2m] = x[2(m —L)], (2.24) 


which are the same expression and equivalent. Similar expressions could be stated for 
upsampling. Since these operators are linear, they can be applied to any linear combination, 
thus to any arbitrary filter H(z). 


2.4 Miultiresolution Representation of a Signal in a Tree 


Structure 
A wavefront phase reconstruction algorithm that is based on multiresolution decomposition 
is described in this dissertation. This concept is related to the wavelet transform, used in 
a variety of applications. The discrete wavelets are a particular class of filter kernels and 
they form a basis (rather than a Fourier transform basis or similar) for signal representation. 
We use an implementation of discrete, multiresolution wavelets that are orthogonal for the 


work presented herein. 


This decomposition is used in a number of applications, such as signal compression. For 
example, in computer image storage, the JPEG2000 standard uses the discrete wavelet 


transform [17]. This compression is possible because most images are vastly oversampled 
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“Analysis Section” “Synthesis Section” 


Figure 2.3. Tree structure of a quadrature mirror filter for single dimensional signal x|n]. Tree 
structures with the perfect reconstruction property result in y[n] being equivalent to a shifted 
x|n]. The channel with G(z) is a low-pass filter and the channel with H(z) is a high-pass filter. 


relative to their content and can be compactly expressed at coarse sampling. A compression 
algorithm can exploit the multiresolution signal structure for improved compression ratios 


at very little computational cost. 


In this dissertation, the AO wavefront sensor has a relationship to the low and high pass 
filter nature of the wavelets. This relationship is exploited to apply the general orthogonal 


wavelets to reconstruct a signal of interest in AO by using a tree structure. 


The multiresolution decomposition of a signal is recursively obtained as a tree structure, by 
successively dividing the signal into two components: low frequency (called the “approx- 
imation’) and high frequency (called the “details”). Since each one of these components 
retain half the bandwidth of the signal, the sampling rate can be reduced by a factor of two, 
so that the overall data rate stays the same. 


Multiresolution decomposition leads to the structure shown in Figure 2.3 called the guadra- 
ture mirror filter (QMF). In particular the “analysis section” performs the decomposition 
into the two channels (approximation and details), while the “synthesis section” recom- 


bines them into the original signal. The approximation and details are given by 


(2.25) 


It is important to notice two things: 
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Figure 2.4. Tree structure of a QMF for single dimensional signal x|n]. The approximation signal 
a;|m] is processed through a second level tree structure which results in a[¢] and do|¢]. 


1. There is no change in overall data rate between the given signal x[n] and the two 
components alm], b[m]; 

2. It can be easily shown by standard arguments that if the two low-pass filters, G(z), 
G(z) and the two high-pass filters H(z), H(z) are ideal with bandwidth of F,/4, then 
there is no aliasing and perfect reconstruction occurs. We can express the output in 
operator form as 

y = (G(z)UDG(z) + H(z)UDH(z)) x (2.26) 


with U and D indicating upsampling and downsampling by two, respectively. 


When the filters are not ideal, such as Finite Impulse Response (FIR), then perfect re- 
construction can still occur, provided that the filters are specially chosen to satisfy some 
orthogonality conditions that preserve the perfect reconstruction property [16]. The filters 
that are used in a QMF system are specifically designed in a manner to cancel out aliasing 
effects due to non-ideal frequency response of the filters such that 


G(z)UDG(z) + H(z)UDA(z) =z". (2.27) 


The output y|n] is a perfect reconstruction of x|n] shifted by L, which can be thought of as 


a processing delay of the filters. This arrangement is ideal for lossless compression. 


As stated at the beginning of the section, the QMF can be replicated in a tree structure 
as shown in Figure 2.4 where the low-frequency component (approximation) is further 
decomposed by a similar QMF; the left half of Figure 2.4 depicts the discrete wavelet 
transform (DWT) for two levels and the right half is the inverse discrete wavelet transform 
(IDWT). After performing the DWT, the original signal x{n] is decomposed into az|¢], 
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Figure 2.5. The 2D QMF breaks the original signal x|m1,n2] into four components that can be 
recombined to form a copy of the signal as y|m1,n9]. 


dy|¢], and d,[m|; this representation contains the same number of samples as x|n] and each 


component contains content at different frequency ranges. 


In this dissertation, we will make use of the extension of QMF to the 2D case. The 2D 
QMF is performed by decomposing each row of an N x N 2D signal x|nj,n2], into the two 
components: a low pass and high pass, as xz|171, 72], xH|11,N2], with m; =0,1,...,N/2— 
1. Then decomposing again each of the two signals x,, xy into xzz|m1,m2], xLH[m1,m2| 
and xyz,[m1,m2], XHH|M1,m2| again with the indices m; = 0,1,...,N/2—1. The result is 
the QMF structure in 2D as shown in Figure 2.5. 


The four output channels can be described similar to the QMF terminology where x ;, is the 
approximation and xpy, xy, and xyy are the details in the horizontal (LH), vertical (HL) 
and diagonal (HH) directions. Each of the channels is now one fourth of the original in 
size. This reduction in signal size scales with the dimensionality, and thus makes the tree 
structure an excellent tool for high-dimensional problems. To perform the 2D DWT, we 
take xyz, and recursively decompose the low-pass component through a 2D QMF. Likewise, 


the four components can be recombined into the original signal by the synthesis filters. 


In Figure 2.6, each row of x|n1,n2] is processed independently. The resulting signals are 
xy|m,n2| and xq[|m1,n2]|. Following this, the next step is to process each column, as shown 


in Figure 2.7. The four signals of the 2D QMF xz , xLH, XHL, and xy are constructed. 


With large data sizes, it is appropriate to perform the 2D QMEF recursively on the xz; 
component. A second level QMF result is shown in Figure 2.8. The original signal is then 


represented with seven signals, each of which contains signal content at different frequency 
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Figure 2.6. Each row of input data is passed through a low-pass and high-pass filter, and then 
downsampled. This process results in a row of data that is the same dimension as the original 
row. Figure is courtesy of Roberto Cristi, from [18]. 
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Figure 2.7. After each row of Figure 2.6 is processed, then each column is processed in the 
same manner. This process results in the four signal blocks. Figure is courtesy of Roberto Cristi, 
from [18]. 


ranges. 


As an example of the usage of the 2D QMF, the decomposition of an image is shown in 
Figure 2.9. The approximation is a lower-resolution version of the original sized image. 
The details can be seen to be contain smaller features and a ghost-outline of objects can be 


seen. 
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Figure 2.8. The 2D QMF can be processed recursively. Figure is courtesy of Roberto Cristi, 
from [18]. 


In this chapter, we discussed the signal processing fundamentals that are necessary to un- 
derstand generalized orthogonal wavelet phase reconstruction. These concepts included the 
Fourier transform, the unit shift operator z, convolution, resampling, and multirate signal 
representation using quadrature mirror filters. With this knowledge, we can now explain 


the importance of phase and our algorithm to estimate it using wavelet filters. 
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approximation 


256 








details 


Figure 2.9. A two-level 2D QMF decomposition of a square image is shown. Figure is courtesy 
of Roberto Cristi, from [18]. 
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CHAPTER 3: 
Wavelet Wavefront Reconstruction 





Telescopes image distant objects by collecting collimated light and focusing it onto a sen- 
sor. The sensor is usually a photographic camera, but can be replaced with a variety of 
optical instruments, e.g., a spectrometer. The telescope pupil and sensor are located in two 
different planes. The incoming light enters the entrance pupil plane and diffracts from 
the aperture, and exits at the exit pupil plane as shown in Figure 3.1. The image plane is 
where a sensor can be placed that would take measurements about the distant science ob- 
ject under study. In telescope imaging, these are the important planes that define the system 
capabilities and are shown in Figure 3.2. It is a standard convention to define a Cartesian 
coordinate frame with the z axis defined to be the optical axis perpendicular to the two 


planes. Consequently, any point in either plane is characterized by the x and y coordinates. 


Telescopes collect collimated light, represented as waves propagating parallel to the optical 


axis, and are expressed using complex scalar notation as 
u(x,y,2,t) = A(x,y, ze Ort ore) (3.1) 


where k is the wavenumber, @ is the angular frequency, A(x,y,z) is the amplitude, and 
(x,y,z) is the phase shift of the wave. The wavenumber k = 27/A is defined by the 
wavelength of the light, 2, and the angular frequency @ = 27.v is defined by the frequency 
of the light, v. For a wave propagating in a three-dimensional (3D) space, a wavefront 
is a surface of the wave where the phase is constant. We call it a plane wave when the 
wavefront can be represented as a planar surface. Most light emanates from a point source 
and is described using spherical waves, but after propagating some large distance, these 
waves appear planar over a small solid angle. Thus, plane waves are ideal for telescope 


collection. 


When the light wave is a plane wave propagating parallel to the optical axis, the field in 
(3.1) can be simplified to 
u(x,y,z,t) = Ael(&-or+9) (3.2) 


where we also assume constant amplitude. At the pupil, which by convention is placed at 
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Figure 3.1. A simplified model of optical system analysis in which an image is formed of the 
object. 


z= 0, the pupil function is defined as 


Pee f ats a (3.3) 
which can be used to describe the scalar field (3.2) directly inside the aperture as 
u(x,y,0,t) = P(x,y)Ael— °F?) = U(x, ye J (3.4) 
where the complex-valued phasor 
U(x,y) = P(x,y)Ae/? (3.5) 


represents the wave inside the pupil. Equation (3.5) is an idealization that the wave incident 


on the pupil is planar with a constant phase throughout the aperture. 


From Fourier Optics [19], it is common knowledge that the optical field in the image plane 
is proportional to the Fourier transform of the field in the pupil plane indicated in (3.5). 
If the phase in the pupil plane is constant, the resulting image plane field is then propor- 
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Figure 3.2. Light rays enter an aperture and reflect through the three-mirror anastigmat telescope. 
The entrance pupil plane is the large primary mirror; the image plane forms at the end of the 
telescope and is where the camera is placed. 


tional to the Fourier transform of the pupil function, which indicates best possible system 
performance. However, because of atmospheric inhomogeneities in the index of refraction, 
the plane wave is disrupted and the phase can vary as a function of spatial coordinates as 
(x,y). Additional phase disturbances can occur due to the manufacturing tolerances of 
optics such as alignment, prescription defects, and vibration. The nonuniform phase across 
the aperture caused by the system optics is called aberration [20]. To consider the combi- 
nation of these effects, we substitute @(x,y) into (3.2) and determine the pupil field to be 
of the form 

U(x,y) = P(x, y)Ael? 9), (3.6) 


The effect of the phase in (3.6) when compared to (3.5) on the resulting image plane field 


causes diminished system performance. 


Phase must be measured indirectly using sensors that can make scalar measurements of 


the irradiance, J[n1,n2|, which is proportional to the electric field amplitude squared [21]. 
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Some sensors, like interferometers, use the superposition of two waves to determine a 
relative phase between the two waves [22]. Other sensors, such as a Shack-Hartmann, 
perform a centroid operation on irradiance measurements to determine a local wavefront 


gradient relative to the unperturbed centroid location [23]. 


Wavefront reconstruction is a mathematical formulation that uses sensor measurements and 
outputs an estimate of the wavefront surface, which can be used for optical characterization 
or feedback control. The goal of wavefront reconstruction algorithms is to estimate the 
wavefront (x,y) from measurements of its gradient vector V@(x,y), which is the focus of 
Chapters 3 and 4. 


3.1 Wavefront Reconstruction in Adaptive Optics 

The main goal of AO is to compensate for the phase distortion of the incoming optical 
field. Phase compensation is provided by a DM which properly adds optical path length 
to the incoming light, thus equalizing the phase as desired. As shown in Figure 3.3, the 
command to the DM is provided by a control system and a phase sensor; the DM makes 
a mirror shape to match the conjugate of the phase, thereby making the wavefront more 
planar. Closed-loop AO systems commonly estimate of the wavefront phase to command 
the controllable DM actuators [3], [24]. Most implemented systems reconstruct the phase 


at the phase points. The control of a DM will be discussed in a later chapter. 


In some wavefront reconstruction applications, local phase differences are sensed, from 
which the whole relative phase has to be estimated. To this goal, a number of algorithms 
exist in the literature, mostly based on a least-squares solution. The AO community has 
developed several algorithms for wavefront phase reconstruction. Ideally the approaches 


would: 


1. Be computationally efficient for a large number of data points; 
2. Be robust to measurement noise; 
3. Result in perfect reconstruction that is consistent with measurements and boundary 


conditions of the wavefront using noise-free wavefront sensor data. 


The first property is essential for future telescopes that increase the number of actuators or 


sensor measurements seeking diffraction-limited system performance. The second property 
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Figure 3.3. Light enters an AO system aperture from a beacon illuminator and science object. It 
is assumed that the beacon illumination and science object experience the same distortion. The 
filled red area on the incoming light represents the undesirable distortion on the wavefront that 
is corrected by the DM. 


depends on the mathematical operations performed by the algorithm and in general depends 
on the statistics of the measurements, including noise correlations. The final property was 
discussed previously by Southwell [25] and Poyneer [26] and the difficulty arises from the 


pupil geometry interaction with the algorithm. 


The original wavefront reconstruction techniques used matrix-vector multiplies [25,27—29] 
with computational complexity of O(n”) or higher with n the total number of samples of the 
sensed gradient. Later, a Fourier transform technique was proposed by Freischlad [30] with 
further refinement by Poyneer [26] that has O(nlogyn) complexity. The computational cost 
of the algorithm is limited by the speed of the implementation of the fast Fourier transform 
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for the change of basis. The papers [26, 30, 31] treat the wavefront sensor measurements 
as a filtering operation, which is similar to the concept of this dissertation, except that their 
filtering operations occur on the global data set and solve the entire phase surface at once. 
During the same time period as Poyneer, Gilles [32] produced a multigrid preconditioned 


conjugate-gradient method with the same computational complexity. 


The first accurate wavefront reconstruction algorithm with complexity O(n) was performed 
by MacMartin, who developed a multi-resolution hierarchic reconstructor [33]. His work 
addressed the issues of controlling an actuator by using only local measurements adjacent 
to the actuator (rather than using all measurements). This “zonal” control can suffer from 
degradation of “global” modes and his work improved the performance by combining local 
and global estimates in a hierarchy. To significantly decrease computation, he downsam- 
pled by factors of 4 or more and results show that the larger downsampling factor decreases 
the relative performance and increases the noise multiplier (or noise propagator of [26]) of 
the reconstructor. In his work, MacMartin averaged gradient measurements, solved ac- 
tuator estimates at a coarse resolution, and solved for piston at the coarsest level using 
interpolation and spatial averaging. Our work focuses on the global estimate only and uses 


a downsampling factor of 2, which gives improved performance of the second property. 


Additional O(7) complexity work followed, such as that of Gilles [34]. The work of Gilles 
was a direct application of a multi-grid solver of a minimum-variance reconstructor based 
on a sparse approximation of the wavefront inverse covariance matrix. Vogel also im- 
proved sparse matrix methods [35] and a multigrid least-squares algorithm [36]. Another 
minimum-variance solver that followed Vogel is the Fractal Iterative Method (FrIM) [37], 
[38], which performs a change of basis (using a fractal preconditioner). Minimum-variance 
reconstruction is an excellent choice, as it is optimal in the sense of maximizing the Strehl 
ratio [39]. 


In the last few years, several new wavefront reconstruction algorithms have been proposed. 
Rosensteiner has produced the Cumulative Reconstructor (CuRe), which is a direct integra- 
tion reconstructor [40]. More recently, de Visser has shown the SABRE algorithm using 


B-spline basis functions [41]. 


An approach based on the wavelet representation was first applied to wavefront reconstruc- 
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tion by Dowla [42]. This original work did not fully exploit the features of the discrete 
wavelet transform in a 2D QMF; rather it was an approximation of two iterations of the 
low-pass filters and downsampling. Peter Hampton and colleagues developed an algorithm 
that used the complete discrete wavelet transform in a 2D QME and was able to perform re- 
construction using the Haar wavelet [43-45]. This dissertation extends their work to allow 
for the use of wavelets with the orthogonal property. 


In this dissertation, we will further describe the benefits of the wavelet phase reconstruc- 
tion. The wavelet technique offers a computational efficiency of O() using Finite Impulse 
Response filters. Hampton’s derivation uses the Haar wavelet and then uses either a Poisson 
smoother or recommends a de-noiser as a follow-on step [45]. Our approach in this disser- 
tation is made to be robust to noise using wavelets that have a longer basis length, yielding 
a smoother reconstruction without a follow-on step. Using noise-free data, the approach 
also yields an exact reconstruction with a zero mean of the two-dimensional data. We also 
extend on Hampton’s work to provide for a solution that requires no boundary conditions 


for the Haar wavelet on a square aperture where the side dimensions are a power of 2. 


The wavelets approach is based on a multi-resolution analysis solution type. As a conse- 
quence of how the discrete wavelet transform is employed, the grid size must be a power 
of 2. In a Cartesian coordinate frame, wavefront values are iteratively reconstructed first 
on a 2 x 2 grid, then 4 x 4, then 8 x 8, and expanded as a power of two in size for each iter- 
ation. Iteration in this context means that the two matrix dimensions of processed data are 
doubled each cycle, not that the entire data is processed repeatedly. However, there is no 
preconditioner or approximation required. The solution algorithm constructs the data for 
each iteration entirely using the slope measurements provided from the Shack-Hartmann 


wavefront sensor. 


The outline of the remainder of the chapter is as follows: the sensor and model that the 
wavelet phase reconstruction uses is described in Section 3.2. The wavefront reconstruction 
2D QME signals for each iteration of the analysis section is developed in Section 3.3. 
The composition of the wavefront estimate using the QMEF synthesis filters is shown in 
Section 3.4. Discussion and simulated wavefront reconstruction examples for several cases 
are given in Section 3.5. Additional steps to improve performance for data from a telescope 


with obscurations are given in Section 3.6. The chapter is summarized in Section 3.7. 


2f 





Figure 3.4. A single lenslet is shown. In this example, a centroid algorithm uses the four 
irradiance measurements at the pixels a through d to determine the center of the beam, which 
is proportional to the two slope measurements, Xr and Yp. 


3.2 Shack-Hartmann Sensor Geometry 


A wavefront reconstruction algorithm is designed to work for a particular sensor geometry. 
The algorithm presented here is designed to work with sensor data from a Shack-Hartmann 
wavefront sensor (S-H WES), which measures local gradients of the wavefront. The sensor 
has an array of small lenses, or lenslets, and is placed in a relayed pupil plane of the 
telescope. These lenslets focus the incoming light onto a focal plane detector (which is a 
standard camera of sufficiently high resolution); the spatial sampling rate of the pupil plane 


wavefront is determined by the grid of lenslets. 


The two local gradient measurements are performed by a centroid algorithm using the 
measured irradiance at the focal plane pixels, as shown in Figure 3.4. Most S-H WES treat 
the centroid operation internally and only provide the two measurement values for each 
lenslet. The S-H WFS is calibrated so that with a planar wavefront, the centroid of each 


lenslet yields zero-valued gradients. 


This sensor is modeled mathematically by the Fried geometry [27]. In particular, each 
lenslet is associated to four phase points in the rectangular grid. The model places mea- 
surements that are centered on each lenslet so that every lenslet has the two measurements 
of the vertical and horizontal components of the gradient, denoted as Xr and Yr, respec- 
tively. 
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Figure 3.5. An array of 4 x 4 lenslets shown as blue circles. The red dots at the corners of the 
grid are the phase point locations that are reconstructed from the orange-arrow slope values. 


The phase points are the locations where the wavefront reconstruction obtains phase values. 
An array of 4 x 4 lenslets is shown in Figure 3.5. These 16 lenslets are surrounded by 25 
phase points and provide 32 slope measurements. The Fried model of a Shack-Hartmann 
sensor is shown in Figure 3.6 and defines the relationship the slope measurements to the 
phase point values as 


Xp (ny, nz] = =(¢([n — 1,n2] — O(n) — 1,02 — 1] + O[n1, 2] — [11,2 — 1)), 


(3.7) 


NI Re N|e 


Yer (ni, n2] = ~(6([m1,n2 — 1] — O(n) — 1,02 — 1] + [11,2] — O(n) — 1,79}). 
Equation (3.7) was rewritten from its original form in [27] to appear in causal form. Close 
observation of (3.7) reveals that the slope measurements can be written as a separable 
filtering operation on the phase points. In order to rewrite (3.7) in operator form, we define 
two filter functions: i 
Vi (l+<"'), 

I 7 (3.8) 
a (1-z"’), 
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Figure 3.6. The Fried geometry relationship between phase points and their measured slopes 
lattice for a single lenslet. A Shack-Hartmann sensor will have an array of these lenslets. 
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Figure 3.7. The block diagram relationship between the phase points and the Fried geometry. 


where the coefficient is chosen that splits the 1/2 in (3.7) evenly. The geometry involves 


one low-pass filter and one high-pass filter and is stated in operator form as 
(3.9) 


Figure 3.7 depicts the block diagram relationship of the Fried geometry to the wavefront. 
The form of (3.9) was first stated in [43]. However, as we will see in Section 3.3.1, there is 


not a direct solution to solve for @ based on (3.9) alone. 


The Haar wavelet is the most simple and was introduced in 1909 [46]. As it turns out, its 
two filters (low pass and high pass) are simple first-order filters which are identical to the 
functions in (3.8) that were used for the operator form. Thus, the operator notation of (3.9) 


is the connection of the Haar wavelet to the Fried geometry. 


In order to reconstruct a smooth, continuous ${n] from the slope measurements, the tree 


structure 2D QMF is employed. The signals Xr and Yr can be processed using multirate 
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DSP to provide the necessary @z7, @LH, OHL, and dyH signals for the 2D QMF. 


In a more general setting, it can be shown that the filters in Figure 2.5 that are associated to 


the orthogonal wavelet families can be factored [47] as follows 


G(z) = g(z)Go(z), 
H(z) = g(—2)Ho(2), (3.10) 
G(z) = g(z)Go(z), 

( ( 


where Go(z), Ho(z), Go(z), Ho(z) are transfer functions of FIR filters. 


In the particular case of the Haar wavelet, Go(z) = Ho(z) = Go(z) = 1;Ao(z) = —1. The 


factorization in (3.10) is at the basis of the phase reconstruction algorithm presented next. 


3.3. Analysis 2D QMF Stage 

The phase is reconstructed by the proposed algorithm in two stages: analysis and synthe- 
sis. In the analysis, we determine the wavelet coefficients of the phase at different reso- 
lutions, based on measured gradients. These coefficients are then used by the synthesis to 
reconstruct the phase. The challenge is in creating the analysis signals from the measured 
gradients, as the synthesis section is standard. In this section, we develop the mathematical 


relationship between slope measurements and the QMF signals at each iteration. 


3.3.1 Iteration for Level 1 
From observation of the 2D QMF in Figure 2.5, we can write equations which relate the 
phase points to the four channels of the 2D QMF, where we omit indices 11, nz for conve- 


nience, as 
$11, = D2DG(z2)G(z1)¢, 


$14 = D2D\ A (22)G(z1)0, 
Our = D2D\G(z2)A(z1)¢, 
dyn = D2D\ A (z2)A(z1)0. 


We have swapped the order of the operators in (3.11) for a convenience of notation and use 


£1 
(3.11) 
£1 


the superscript 1 to denote the first iteration, not an exponent. Expanding (3.11) using the 
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Figure 3.8. The 2D QMF is performed for the first iteration. At this point, half of the frequency 
domain content is directly computable (in black). The LL and HH sections must be properly 
iterated. 


factoring of (3.10), we can make substitutions from (3.9) that result in 


bt = D2D)Ho(z2)Go(z1) (g(—z2)g(z1)¢) 

= DD Ho(z2)Go(z1)Xr. (3.12) 
$uyt = DrD Go(z2)Ao(21) (g(z2)8(—z1)@) 

= DD, Go(z2)Ho(z1)Yr- 


The remaining two quantities, oi, and OF 17> do not have a simple relationship to the slope 
measurements Xf and Yr, because the filters of Equation (3.11) are both low-pass or high- 
pass, whereas the slope definitions require one of each. The quantities are shown in the 
frequency domain in Figure 3.8 with L and H corresponding to low and high frequencies, 


respectively. 


As seen in (3.12) the terms LH and HL have the proper mix of low-pass g(z) and high-pass 
g(—z) filters to generate the measured gradients Xr and Yr. The terms LL and HH do not 
contain a low-pass and high-pass filters, which indicates the need to further decimate these 


signals as shown in the next subsection. 


3.3.2 Iteration for Level 2 


The arguments presented in what follows are based on two fundamental operations: 
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1. the Noble identities, Equation (2.22) in Section 2.3, recalled here for convenience: 


y, = Di A(et)x = H(z1)Dix, 
yy = DoH (z3)x = H(z2)Dox. is 
y3 = Di A(z)x = H(z22)D 4x, 
y4 = DoH (z1)x = H(z)D2x 


2. geometric sum identities, listed in (3.14), which are properties of the Haar high-pass 


filter: 
N-1 
g(-z¥) = . =) g(—z) YN>1 (3.14a) 
é=0 
y-1 
= >: z | V2e(z)g(—z), if N is even. (3.14b) 
é=0 


Equation (3.14) reveals that high-order complexity filters can be implemented as a 
delayed summation of first-order filters and its complete proof is shown in the Ap- 
pendix. Equation (3.14a) is used when a high-pass filter is needed, whereas (3.14b) 


is used when a low-pass filter is needed. 


In this algorithm, when there is an unknown quantity, we apply the 2D DWT and break 
apart the unknown quantity into 4 new channels. We now explore the second iteration 
because we have two unknown quantities from the previous section; we must do this for 
both the or, and oi) y channels. First we will only consider oir; we start by writing out the 


expressions for the four channels in the analysis: 


Ors = D2DiG(22)A (a1) Or (3.15) 
= D2DG(z2)A(z1)D2D1G(z2)G(z1) 9, 

PHH/L = DyD\A(22)A(z1) op, 
= D2D\H (22) (z1)D2D1 G(z2)G(z1)o 


We are using the superscript 2 for the second iteration and add the /L subscript for the oi, 
data. There is additionally a/H analysis for the oi) y data. For brevity, we will only state the 
HH results at the end of this section since their development follows the same analysis, but 
we emphasize the resulting expression is not exactly the same. Examining Equation (3.15) 
reveals that while or, /L is only comprised of low-pass filters, the remaining three channels 
have a combination of low-pass and high-pass filters. If we factor the filters as in (3.10), 


we will again find substitutions of (3.9) with the measured slope data. For Pry jt We obtain 


Oi), =D2D1Ho(z2)8(—z2)G(z1) 
D2D1G(z2)Go(z1)8(z1)¢ 
=D 7D Ao(z2)G(z1) 
‘ (3.16) 
DD G(z2)Go(z1) (¢(—z3)g(z1)) 
=D 7D HAo(z2)G(z1) 


DyD, G(22)Go(21)8(22) (V2Xe) 


Using the same procedure, we can also solve for 07, L/L 8 


=D D,G(z2)Ho(z1 an 
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The final channel, or, H/L? yields two possible simplifications 


bin, =D2D1 Ho (22) 8(—22)A(z1) 


D2D, G(z2) Go(z1)g(z1)@ 
=D D\Ho(z2)A(z1) 
DD; G(z2)Go(z1) (¢(—23)8(z1)9) (3.18) 


or similarly 
=D 7D,HAo(z1)A (z 2) 
DD G(z2)Go(z1) (¢(—zt)g(z2)¢) - 


The two possible substitutions arise from the flexibility of having two high-pass filters. 
Either simplification is exact when using noise-free data. Rather than choose one definition 


over the other, we take an average 


1 Z s 

din /L = 3 P2D1Ho(z2)H (21) 
DyD\G(Z2)Go(z1) 8 (z2) (v2xr) 
(3.19) 

1 £ s 
as 5 P2D1H (22) Ho(z1) 

D2D\Go(z2)G(z1) 2(z1) (v2¥r ) : 
The averaging of (3.19) allows for some robustness against noise in the slope measure- 


ments at very little computational cost. The 1/2 coefficient simply assumes additive white 


Gaussian noise. Noise correlation statistics analysis may provide a better coefficient. Using 
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the same process for the HH data, we now state the results as 


dinyn = DxD\Ho(z2)G(z1) 
DyD) A (22) Ao(z1)g(—z2) (v2¥r ) 
dia = D2D\G(z2)Ho(z1) 


DDH (z1)Ho(z2)g(—z1) (v2xr) 


1 x 3 3.20 
OB = =D2D flo 20) Al (21) ae 


2 
DDH (z2)Ho(z1)g(—z2) (v2¥r) 
1 zt 7 
+ 5 P2D 1H (22)Ho(z1) 


DoD) Ao(22)A(z1)g(—z1) (v2xr) 


We have now completed the derivation of the second iteration and show it in Figure 3.9. 
While the equations look complex on paper, actual implementations are straightforward 
and efficient in processing performance. Every expression is simply a serial grouping of 


the filter-filter-downsample-downsample block. 


3.3.3 Further Iterations 

We are able to generalize the formulation for additional iterations. After the second iter- 
ation, there are two unknown quantities 7, IL and 7, JH We now seek to generalize the 
formulas for each level k > 2. Until the final iteration, there will still be two unknown 


quantities. We first write out the formulas 


Of /L = (D2D\ A (z2)G(z1)) (D2D1G(z2)G(z1))* 
(D2D1G(z2)G(z1)) @, 


Oho, = (D2Di G(z2)H(21)) (D2D1G(z2)G(er)) © oi 
(DD G(z2)G(z1)) 9, 

Qiu jp = (D2D1 A (z2)A (a1) (D2D1G(z2)G(z1)) 
(DD G(z2)G(z1)) 9. 
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Figure 3.9. The 2D QMF diagram shows the channels at the second iteration. The upper left 
and lower right are each divided into four channels. Only two blocks remain unknown (in red) 
that require further decomposition. 


The intent of the exponential notation is that the operations inside occur k — 2 times. We 
choose to express these equations as three groups since the left group will be factored to 
shift a filter to the right group. Again we will factor the H(z) on the left side, then move 
the g(—z) to the right. As g(—z) is swapped position with the downsampling operators, 
the Noble identities will apply, resulting in g(—2"'). Then the high-order filter will be 
simplified to a delayed summation of the first-order filter. The relationship to the slope 


measurements can then be made. The end result for the LL data is 
~ ~ ~ ~ k-2 
On jp = (D2DiAo(22)G(z1)) (D2D1 G(z2)G(z1)) 


ake1_y 
(DD G(z2)Go(z1)) (( y =) Xr). (3.22) 


é=0 
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Of 1, = (D2D1G(<2)Hol(z1)) (D2D1 G2) G(z1)) 


ee 
(D2D)Go(z2)G(z1)) (( y =) | 


l=0 


with a combination summation for the HH/L channel 
1 és e ~ ~ k-2 
Or Ss (D2D}Ao(z2)A(z1)) (D2D1G(z2)G(z1)) 


2k-1_] 
(DD G(z2)Go(z1)) (( y | w) 


me (3.23) 
+ = (DoD) H(z2)o(z1)) (D2D1G(z2)G(z1)) 


2 
grb 
(D2D1Go(z2)G(z1)) (( y | rr) 


£=0 


The HH data is again developed through the same manner and results in definitions with 
some slight differences 


Orn /H = (DD, Ao(z2)G(z1)) (D2D1G(z2)G(z1))* 
2k-2_] 


(D2D1 A (z2)Ho(z1)) (( y a Ve(—ai¥e] ; 


i (3.24) 
Orit /H = (DD, G(z2)Ho(z1)) (D2D1G(z2)G(z1))* 


geet 


(D2D1Ao(z2)A(z1)) (( y | VPa(-2)%e) 


t=0 
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and the final channel, HH /H, is again defined as a combination 


Oh /H = ; (DD) Ao(z2)A(z1)) (DoDiG(z2)G(z1))* * 


ee | 


(DDH (z2)Ho(z1)) (( y =| Veta] 


oo (3.25) 
+5 (D2DiAl(z2)Mo(21)) (D2DiG@)G(a)) 


2] 


(D2D,Ao(z2)(z1)) (( by «| VPa(-a)%e) 


=0 


The summations represent either a zero-padded shift or a circular shift of the data, and the 
choice should match the preferred implementation of how the sequences are treated for 


boundary conditions. 


By developing this level k implementation, we are able to scale the algorithm for any 2" x 
2 sized data quickly. This algorithm is possible due to the high-pass filter simplifications 
of (3.14). We are able to construct the multirate 2D QMEF signals for both the LL and HH 


channels using the measured quantities Xp and Yr. 


3.3.4 Defining the Values for Unsensed Modes 


At the final iteration, we have two sets of 2 x 2 matrices. Each one is of the form 


bee ah [es sae (3.26) 


@Ly/L PHH/L OuL/H PLL/H 


and no further downsampling is possible because each entry is a 1 x 1. The upper-left scalar 
value of the $77; channel and lower-right scalar value of the @,;/;, channel have gone un- 
determined for all prior iterations. Each value represents undetected modes of the Fried 
geometry: the piston and “waffle” modes. The piston represents the mean of the entire [n] 
data set. Since the S-H WES only measures differences between phase points and not abso- 
lute values, the mean value cannot be known and a separate sensor is required for measuring 
piston. We can assign $77 /; = 0 and accept that we are within a constant value of the ac- 


tual piston of the wavefront. The “waffle mode” represents a nuisance checkerboard pattern 
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Figure 3.10. The 2D QMF is performed iteratively on the upper left and lower right until scalar 
values remain. This example figure has three levels until only scalar values remain. 


(same amplitude but alternating sign between adjacent values) along the phase points with 
a mean of zero; it is the highest frequency component in k,, K, and we can set @;7/77 = 0 
by assumption. We show the completed 2D QME analysis section structure in Figure 3.10, 


where all values are known. 


3.4 Synthesis 2D QMF Stage 

Up until this section, all of the previous algorithm steps have been used to iteratively create 
the four channel blocks of the analysis section. We separated each unknown channel into 
four sub-channels. While we did not have the direct information for each channel, we were 


able to substitute for it using the measurements that were available. 


The analysis section is now complete and must now take the four channel blocks and per- 


form the inverse discrete wavelet transform as shown in Figure 3.11. The result can be 
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Figure 3.11. Synthesis section of the 2D QMF is shown. 


expressed in operator form as 


o* n] =G(z1)Us (G(e2)U20hF fm] +H (z2)U20 fi "Im|) 


(3.27) 
+H(z1)U1 (G(z2)U26p5'[m] +H (zo)U2¢r (ml) 


Equation (3.27) is the standard form of the synthesis section of a2D QMEF and has not been 
modified to fit the algorithm. For a given index k, both channels or /L and of, H/H Must 
be processed from their respective QMF signals. The index progresses from the largest 
iteration index, k+ 1 = kmax and decrements down to k = 1. Thus, we recursively perform 
this until we have no more four channel blocks; once k = 1, the four blocks diss Oba oy L 
and ban can be processed through the synthesis filters one final time that results in the 
estimated solution of the wavefront phase surface, @[n]. 


3.5 Discussion 


3.5.1 Resampling Haar Wavelet 

As previously discussed, the choice of wavelets has an effect on the phase reconstruction 
and its sensitivity to noise and boundary conditions. Higher-order wavelets (such as the 
Daubechies family) yield filters with longer impulse response and better filtering capabil- 
ities. The drawback of filters with a longer impulse response is clearly the fact that they 
have longer transient responses, thus extending the effects of the boundary conditions. 


On the other hand, the Haar wavelet, which yields the simplest first-order filters at all the 
stages, if properly implemented, is completely independent of the boundary conditions, 


4] 


provided the data matrix is square with dimensions as power of two. By adding a positive 


shift at the analysis network (see Figure 2.3), the four filters become 


G(z) =zg(z), A(z) =zg(—z) 


(3.28) 
G(z) = a(z), A(z) = —g(—z) 


using g(z) defined in (3.8). With this choice of filters, the “approximation” and “details” 


signals a[m] and d[m] in Figure 2.3 can be related to the input x|n] as 


1 
a|m] = —= (x|2m+ 1] + x[2m]) 
e (3.29) 
aon) = (s{2m+ 1] —(2ml), 
and the output becomes 
1 
y[2m] = = (alm] —d|m]) = x[2m| 
G (3.30) 
y[2m+ 1] = 5 (a[m] + d|m]) = x[2m-+ 1]. 


This choice of filters yields perfect reconstruction y|n] = x|[n], n = 0,...,N — 1, provided 
the data length N is even, regardless of boundary conditions. In other words, in the case 
of the Haar wavelet, the effects of boundary conditions get discarded by the resampling 
operations. If the data length is a power of 2, this will be true for all resolution levels in the 


decomposition. 


3.5.2 Effects of Filter Selection on Noise 

The major contribution of this work compared to Hampton’s derivation [43] is the extension 
to more general QMF filters. The Fried model is not always an accurate reconstruction of 
the wavefront since it only relates neighboring sample points as seen in (3.7). In this section 
we look at the Daubechies family of filters, namely dbn, with n positive integer and their 
effects on the phase reconstruction. In particular, dbl corresponds to the Haar wavelet we 
presented. The number n corresponds to different filter lengths (or equivalently polynomial 
lengths). The number of filter coefficients in G(z), H(z), G(z), and H(z) are all twice the 
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Figure 3.12. (a) The digital frequency response of G(z) for the Daubechies family is illustrated. 
The frequency response for H(z) would be mirrored at 2/2. (b) The digital frequency response 
of Go(z) for the Daubechies family is illustrated. The frequency response for A(z) would be 
mirrored at 2/2. The filtering improvement can be readily seen. 


Daubechies number. For example, Daubechies 3 uses filters of length 6. 


These factored polynomials of the wavelet factoring offer the same characteristics as the 
other approaches while also fitting into the wavelet technique. Longer polynomials can 
smooth the noise, but the choice of filter length should be considered against the dimension 
length of the data. Having a large filter length but using fewer data does not yield better 


results. 


The frequency response for the Daubechies family is shown in Figure 3.12. These filters 
are also known as “max-flat” since they are smooth at low and high frequencies. As a 
consequence, as the order increases the filters become and more selective and provide better 
attenuation of the aliased components of the noise. The noise has impact on each of the 
four channels (LL, LH, HL, and HH). This improved noise rejection implies that the noise 


will be diminished on a subset of the channels. 


As AO systems increase in size and therefore also the density of actuators and sensors, 
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(e) (f) (g) (h) 


Figure 3.13. (a) The original 128 x 128 wavefront; the remaining images are all reconstructed 
(b) with the Haar wavelet, result is the same as would be for [43]; (c) 10 dB SNR Haar wavelet; 
(d) 3dB SNR Haar wavelet; (e) 10 dB SNR with the Daubechies 3 wavelet; (f) 10 dB SNR 
with Daubechies 3 wavelet; (g) 10 dB SNR with Daubechies 9 wavelet; and (h) 3 dB SNR with 
Daubechies 9 wavelet. 


the larger filters are less sensitive to noise than the Haar wavelet and are an appropriate 
choice. For example, in Figure 3.13, we show the results of several reconstructions using 
the Daubechies family wavelets with 10 dB and 3 dB signal-to-noise ratio (SNR). 


Using only the Haar wavelet, the resulting reconstruction contains a 2 x 2 checkerboard 
pattern. The pattern is also apparent at larger resolution in Figure 3.13 (c) and (d). With the 
longer wavelet lengths, we are able to have a smoothing effect on the result, as shown in 
Figure 3.13 (e) through (h). In the paper by Hampton et al. [45], they perform smoothing 
using an iterative Poisson solver which relies on having a previously reconstructed wave- 
front as an estimate and gradient data that is independent from the estimate. In our work, 
we are able to provide the smoothing effect from longer filter lengths; which can be seen 


in comparing Figure 3.13 (e) to (g) or (f) to (h). 


3.6 Telescope Apertures 
The wavefront reconstruction algorithm presented in Sections 3.3 to 3.5 can be directly 
applied to data from a telescope with a non-square aperture and other features such as 
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segmented primary mirror and central obscurations from the secondary mirror and support 
structure. There will, however, be errors near the boundary edges where the Fried model is 
incorrect. For improved performance, this section explains how to correct for errors at the 
boundary for masked data stored within a 2" x 2" matrix; however, the boundary correction 
adds computational complexity, but still quite a bit lower than the standard least-squares 
approach. The results presented here are a full theoretical explanation, and reductions in 


operations would be used in an actual real-time implementation. 


We begin by defining the mask, or window function, as 


wal : ounce aperture (331) 
1 inside aperture 
where n = [11,72]. As we did before, we define the Fried gradient operator as 
21 82 
Vr(z1,22) = g(z1)8(—22) (3.32) 
g(—z1)g(z2) 


which calculates the values Xr and Yr for (3.9). 


With these two definitions, we can now define two sets of indices for the boundary and 


inside the aperture as 


={n | [\Vew(nll| 40}, 


B 
(3.33) 
W={n | ||Vew(n]|=0 and w{n]=1}. 


For simplicity, let us define the entire reconstruction phase reconstruction algorithm pre- 


sented in Sections 3.3 and 3.4 as an operator J such that 
$[n] = H(Vro|n)), (3.34) 


provided the mean and high frequency modes are both zero. Since all operations in the 


algorithm are linear, the entire algorithm is linear. By having this property, we can then 
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write the expression 


(3.35) 








where the first term on the right shows the gradient Xr, Yr masked by the aperture w[n] and 
we define the error, E[n| = Vr (w[n|@[n|) — w[n| Vr ¢[n]. Since it can be easily seen that 


E[n] is identically zero outside the boundary, as 


Efnl=0 © n€&, (3.36) 
the error can be written as 7 
Xx, 

E[n) = Y Ki S(n—@] (3.37) 
LcB Ye 


where 6 [n] is the two-dimensional Kronecker delta function. The terms X, and Y, are the 
corrections we need to apply at the boundaries of the aperture. We can solve for Xp and Y¢ 
by performing the Haar reconstruction operator to both sides of (3.35), which results in 


5{n— 4] 
5 | (3.38) 
0 











By taking the Fried gradient operator of both sides of (3.38) we obtain, forn € YW 


= VrH (vi ie 


Yrin 











+ ¥ xXe+ Y Tyk. (3.39) 


LEB LEB 
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Using linearity, we define the impulse responses as 


Ty {n, é) = VK (|° 4 4) e RX), 
(3.40) 


rine = vex (| . |) ex 
5(n—¢] 


forne W,£¢ Z&. These definitions can be precomputed, and have no dependence on the 


























slope measurements. 


Using the impulse responses [x |n, €] and T'y|n, €], forn € W and £ € Z&, we can solve for 


X, and Y, in (3.38) from a set of linear equations as 


XF [nl] 


Yr [n] 








—VFH (vin 








= ) Tx(n,4)Xe+ ¥ Ty[n, 4% (3.41) 
LcB LcB 

forn € WY. The left-hand side is known from the measured gradients and (3.41) yields 

ny = 2|W| equations in ng = 2|Z4| unknowns, with |W| and |A| the number of sample 

points inside the aperture and on the boundary respectively. It can easily be seen that (3.41) 

can be written in matrix form as 

tw =lig (3.42) 


with zy and Zy are the corresponding ny x | and ng x 1 vectors on the left and right 
hand sides of (3.41), and I the corresponding nw x ng matrix. Equation (3.42) is under- 
determined and therefore Zz = (rr) r Tey is solved by least-squares that can be re- 


duced in operations due to many zero-value eigenvalues [21]. 


To give an idea of the dimensionality, for a 64 x 64 data matrix containing a circular aper- 











ture with radius p = 29, the matrix T € R4976x456 However, since all gradients on the 





boundary yield redundant information, it turns out that the number of unknowns can be 


considerably reduced. In this example, the matrix P™T can be decomposed as 
I'T = UAU! (3.43) 


where A; = diag(A), are the eigenvalues as shown in Figure 3.14. The first 283 eigenvalues 
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Figure 3.14. The eigenvalues of the 64 x 64 circular aperture in monotonic order are shown. 


are zero and only M = 173 eigenvalues are not zero. As a consequence, we can factor 


r'r=UAU (3.44) 











with A = diag(A;,..., Ay), A; > 0 and U € R”%*™ orthonormal. Based on this decompo- 
sition, the corrective term Zz can be solved as 





4 
a=A UT 
= “W (3.45) 





where the matrices A 'U'T’ and U are M x ny and ng x M respectively. In the example, 
they would be 173 x 4976 and 456 x 173 respectively. All of these matrices are precom- 


puted, since they depend on the aperture only. 


Having solved for Zz, we now can use X, and Yy in (3.41), so that we have the corrected 
gradients of w[nj@ [nj as 


X(n] = Xp[nj+ )° Xd[n— 4], 
LeB 

Y(n] = Ye(nj+ )° ¥pd[n— 4]. 
LeB 


(3.46) 


Using the corrected gradients X and Y, we run the wavefront reconstruction algorithm (with 


the option of a different wavelet for better results) to obtain the wavefront phase estimate. 
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(b) 


Figure 3.15. (a) The original 256 x 256 wavefront with a telescope mask applied; (b) the 
reconstructed wavefront using the Daubechies 3 wavelet. 


In the previous section, we provide example results using a square aperture. We now con- 
sider a realistic segmented mirror telescope scenario where there is an outer edge that is 
non-square and central obscuration by a secondary mirror and its support structure. We 
simulated this by generating data on a square aperture and then using zero value entries 


outside of the telescope aperture mask. 


In Figure 3.15, the algorithm is applied to simulated data for a notional segmented telescope 
system. We do not use any boundary correction or modification of the measured wavefront 
data and the result is still successful in reconstructing the wavefront. In Figure 3.16, we plot 
the 256 pixels across a row for the original wavefront in comparison with two reconstruc- 
tions using the Daubechies 3 and Daubechies 9 wavelets. The reconstruction has errors 
near the boundary edges. Since the Daubechies 3 wavelet is shorter in filter length, it is 
able to converge to the actual wavefront values closer to the edge than the Daubechies 9. 
The Daubechies 9 wavelet also has more smoothing than the Daubechies 3 due to increased 
filter length, and its result has less error in reconstruction when far from the edges such that 


they have no influence. 


In Figure 3.17, the corrected reconstructed wavefront can be compared to the original wave- 
front and the wavefront reconstructed with the algorithm of Section 3.3 only. The improved 
performance near the boundary edge is apparent. In addition, the correction also estimates 


the wavefront hidden underneath the structural support of the secondary mirror. 
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Figure 3.16. Rows (a) 220, (b) 177, and (c) 90 from Figure 3.15 are shown. In each plot, the 
dashed line shows the original wavefront compared against the reconstructed wavefront shown 
by solid line for Daubechies 3 and by dotted line for Daubechies 9. 


3.7 Summary 

In this chapter, we presented a wavelet phase reconstruction algorithm to estimate the phase 
@(n). This quantity is important for AO systems to achieve their goal of improved imaging 
of science objects. With an estimate of the phase, a DM can be commanded to compensate 


for atmospheric turbulence which degrades the performance of telescopes. 


The algorithm presented in this chapter relies on the premise that the measurements Xr and 
Yr are the gradients of the phase function @{n]. In the next chapter, we discuss when this 
is not the case and provide a means to estimate @(n| even when the measurements do not 


satisfy this condition. 
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Figure 3.17. Three rows from Figure 3.15 are shown after applying the boundary correction. 
In each plot, the dashed line shows the original wavefront compared against the reconstructed 
wavefront shown by dotted line for Daubechies 3 and by solid line for Daubechies 3 that has the 
boundary correction applied. 
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CHAPTER 4: 
Reconstruction When Branch Points Are Present 





In this chapter, we extend the algorithm of Chapter 3 to more general rotational vector 
fields, where the irradiance may be zero at isolated points, thus making the phase undefined. 


The phase can have discontinuities, as the wavefront may be severely degraded. 


Strong atmospheric turbulence causes scintillation, which is the fluctuation of irradiance. 
Regions of the telescope pupil can have non-uniform amplitude A(x, y), or apodization 
of the light wave [20]. In some regions, the amplitude decreases significantly and nulls 
form. A particularly difficult problem can occur for wavefront sensors that take gradient 
measurements of the scalar @(x,y) field near these nulls, as the irradiance measurement to 
estimate phase will have poor SNR. This issue is referred to as the branch point problem of 


adaptive optics [48]. 


The Rytov variance for a plane wave [49] is commonly used as a metric of the strength of 


the scintillation and is defined as 
Of = 1.23C2K7/6p!1/6 (4.1) 


where C? is the refractive index structure parameter, k is the wavenumber, L is the path 
length through the atmosphere. Some AO literature uses the measurements of the Rytov 
number Oy for the same purpose. The relationship between the two quantities can approxi- 
mated by Op ~) 407; for further details see Section 8.2 in [49] and Equation (2.111) in [50]. 
Although the literature for atmospheric beam propagation [49] establishes the weak scin- 
tillation regime by Op < 1, Barchers et al. [51] states that branch points form around a 
Rytov number of Oy = 0.08 (of © 0.32), and the presence of branch points becomes sig- 
nificant at Oy = 0.2 (op ~ 0.8) such that a Shack-Hartmann wavefront sensor has reduced 


performance. 


In this chapter, we develop a modification to the slope measurements that improves the 
quality of the reconstruction. This modification is independent of the reconstruction al- 


gorithm. In the next section, we introduce branch points. Phase wrapping is discussed 


ays) 


in detail in Section 4.2. In Section 4.3, the vector field decomposition is explained. Sec- 
tion 4.4 is the description and comparison of Fried gradients and wrapped Fried gradients. 
Section 4.5 is one of the main results of this dissertation and has the least-squares phase 
estimation from the wrapped Fried gradients. In Section 4.6, we provide several examples 


of the algorithm. Finally, in Section 4.7, the conclusions are presented. 


4.1 Effects of Branch Points on Phase Reconstruction 

The branch point phenomenon was first observed by Nye and Berry in 1974 [52]. The 
original work in phase reconstruction when branch points are present was done by Fried 
and Vaughn [53]. Their analysis shows that branch points appear in areas of low irradiance. 
Phase discontinuities form between branch points along branch cuts, which prevent path- 
dependent phase unwrapping from traversing the cuts. Branch cuts are placed between 
branch points of opposite signs or between a branch point and an edge boundary of the 
surface. The path of a branch cut is not unique and they can be placed along areas of 
low irradiance by taking irradiance (SNR) into account in the reconstructor. The issue 
is how to pair branch points for branch cuts, which is not a trivial problem. Regions of 
the wavefront may not be completely surrounded by branch cuts, as this prevents phase 
unwrapping. Short branch cut lengths are desirable for AO systems with a continuous 
DM that cannot form shapes with discontinuities. The branch cut selection algorithm must 
also be computationally tractable for AO feedback control. Many branch cut selection 


algorithms were developed for Synthetic Aperture Radar [54], and can be applied to AO. 


Fried [48] continued the analysis of branch points in wavefront reconstruction. He noted 
that if branch points are present, then the measured phase difference is not a gradient of a 
scalar field. Instead, the measured phase differences have two components: a gradient of a 
scalar potential and a curl of the vector potential. The rotational curl component leads to 
“hidden phase,” as the least-squares reconstruction can only recover the irrotational scalar 
potential. In order to recover the original phase, the curl component must be zero valued. 
The vector potential is determined as the solution of a Poisson equation, which requires 
the locations of the branch points. This analysis leads to a closed-form solution of the 
hidden phase contribution from each branch point. The “total phase” is then the sum of the 


least-squares reconstructed phase and the hidden phase. 
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Tyler [55] performed an analysis of branch points by using the Fourier transform of the vec- 
tor field. He decomposed the observed gradient phase field into irrotational and rotational 
components. Both components are orthogonal to each other and the rotational component is 
shown to be hidden in the null space of the standard least squares reconstructor. The “slope 
discrepancy” was determined to be the difference of the gradient field of the least-squares 
reconstructed phase and the original measurements. Tyler defines the slope discrepancy as 
a matrix operator on the gradient measurements that can be used to reconstruct the correc- 
tion to the least-squares wavefront reconstruction. The correction procedurally follows the 


least-squares reconstructor. 


In our approach, which is presented in this chapter and in [56], we interpret the 2D vector 
field problem in terms of linear algebra and vector spaces. We begin with the Helmholtz 
decomposition of the measured vector field in terms of orthogonal subspaces of the gra- 
dient and curl components. Our approach is different from [48], [55] by changing to a 
non-orthogonal decomposition and using least-squares. We can show that a family of wave- 
fronts can be generated that match the measured gradients and our solutions can be adapted 


to the desired branch cuts. 


4.2 Phase Wrapping 

A least-squares reconstructor cannot produce the discontinuities of the wavefront caused 
by the presence of branch points, and incorrectly estimates the phase values across the 
wavefront surface. Although a smooth phase function cannot be determined, there is a 
family or ensemble of phase functions that all have the same gradient measurements and 
different algorithms may result in different phase functions using the same measurements 


for this reason. Unwrapped phase functions are also members of the ensemble. 


There are several phase quantities to note. The true phase is @[n,,n2]. This quantity can 
never be known exactly, but it can be estimated as a relative phase. The only guaranteed 
relationship between true phase and the estimated phase is that the wrapped quantities are 
equivalent, as 

W{o{[n]} = Wid [n] +C}, (4.2) 


where C is a constant. Wrapped phase is always guaranteed to be contained in a 27 range; 
for our work here we prefer the definition —z < W{@|n]} < a. The difficulty imposed by 
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the wrapping is that it is a nonlinear operator of the form 
W{o[n]} = o[n] + 27k(n] (4.3) 


where k|n] are integer values that ensure —72 < o[n] + 27k|n| < 7. It is impossible to know 
the original true phase from the wrapped values; only relative phase values can be known. 


Unwrapping algorithms can remove any sharp discontinuities by finding values for k[n] 
that result in a continuous surface for the relative phase. Unwrapped phase is not restricted 
to a particular range, but it is desirable that the surface is continuous and there are no sharp 
transitions between adjacent phase points. Thus, neighboring phase points are kept to less 
than a 27 difference to avoid any ambiguities. This definition means that the constraint for 


unwrapped phase is on its difference as 


|@ [11,22] — @[m —1,n2]| < 20 


4.4 
|@[1,n2] — O[n1,n2 —1]| < 2m a 


Increasing the spatial sampling rate is one way to avoid a 27 or larger difference between 
two adjacent phase values (or the limit imposed by the sensor). 


In general, the estimated phase output from the wavefront reconstruction is not guaranteed 
to be either a wrapped or unwrapped phase. In many cases, the estimated wavefront is 
smooth and can be considered the unwrapped phase. However, there are instances where 
this condition is not satisfied by the wavefront reconstruction algorithm, and the result 
requires a phase unwrapping algorithm to yield a continuous wavefront surface. While 
some reconstruction algorithms do perform unwrapping (such as the exponential recon- 
structor [57-59]), we treat reconstruction and unwrapping as two separate operations. In 
this dissertation, we are only presenting a novel reconstruction algorithm and not an un- 


wrapping algorithm. 
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4.3 Vector Field Decompositions 


The Helmholtz decomposition states that a vector field in the three dimensional space is 


represented by a gradient component and a rotational component as 


yw=Vo+Vxv (4.5) 














where V = [V, Vy, V]/ represents the gradient operator, @ (x,y,z) € R is the scalar poten- 











tial and v(x, y,z) € R? is the vector potential, with the outer product Vx defining the curl of 





the vector. As we can observe in (4.5), vector fields are composed of a gradient of function 
and a curl of a vector field. The gradient of a function is an irrotational (curl-free) vector 
field; whereas the curl of the vector field is a rotational (divergence-free) vector field. In 


particular, in the case of a two dimensional vector field in the x, y plane 


w(x.) = | Vel) | (4.6) 
W(x,y) 


where we assume the component along the z axis to be identically zero, the scalar potential 
is @(x,y) and the vector potential is along the z axis as v = [0,0,v(x,y)|’. These assump- 


tions lead to a simple expression of the decomposition (4.6) in matrix form as 











Wy (x,y) Vy —V, v(x, y) 
In the Fourier domain, equation (4.7) relates complex vectors as 
W.(K) = j2UK  j2Mky P(K) (4.8) 
P,(K) j2mky, —j2mK, | | V(K) , 











with ®(K) and V(Kk) the 2D Fourier transforms of @(x,y) and v(x,y) respectively, and 
K = [Ky, Ky]. Equation (4.8) corresponds to the decomposition of the complex vector ‘Y(K) 


in terms of the orthogonal reference frame defined by the two columns of the matrix in 
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(4.8). Simple matrix inversion yields the two potentials (scalar and vector) computed as 


(4.9) 


V(x) 


ee 2) | O(K) 
((j27k,) + (j2mky) ) | | j2mKky —j2Uk, Y,(K) 


poe j2Mky | ee 


The problem of phase reconstruction is to recover the overall phase we call @o(x,y) from 
observed gradients w(x, y). This problem is typical in adaptive optics [24] or Interferomet- 
ric Synthetic Aperture Radars [60], where local phase differences are observed directly. 
When the vector field is irrotational, i.e., the curl component v(x, y) is absent, the overall 


phase @o(x, y) is the same as the scalar potential @(x,y) since by definition, 


y(x,y) = Vo(x,y). (4.10) 


When the field is irrotational, we have seen that the integral of the measured gradient is 
independent of the path of integration. In practice, we do not integrate along a path, due to 
noise and the fact that we want to use all the data we have rather than only the measurements 
along the path. 


Algorithms designed for this reconstruction are based on a matrix representation of phase 
differences [26—28] as 
vec (y{:,:]) = vec (@|:,:]) (4.11) 


with vec(-) representing matrix-to-vector reshaping of sampled gradients y and potential 
@, and Fa matrix of appropriate dimensions with approximately twice the number of rows 
then columns. Solving for @ in (4.11) yields an overdetermined set of equations solved by 


least squares to estimate @ as 
vec(l:,:]) = (PT) "TT vec(y{:,:]). (4.12) 


Although in the applications of interest, the observation vector y(x,y) is made of phase 
gradients, the presence of singularities and the fact that all phase values are wrapped to lie 
within the interval |—7, 7), makes the vector potential v(x,y) to be nonzero. As a conse- 
quence, the phase (x,y) to be estimated is not the same as the scalar potential @ (x,y). The 
substitution of (4.5) into (4.12) shows that the vector potential information is lost in the 


orthogonal frame. Therefore the computation in (4.12) yields a least squares approxima- 
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tion for the scalar potential and cannot account for the orthogonal component (the “hidden 
phase” in multiple references such as [48] and [55]) associated to the null space of the 


matrix I’ in equation (4.11). 


A solution to this problem, proposed in this dissertation, is to use a different, non-orthogonal 
reference frame. This frame will be shown in the next section to be well suited to the com- 
putation of phase data in the presence of phase wrapping and singularities such as branch 
points. In order to see this, we replace the representation in (4.7) with a non-orthogonal 


frame as follows 











| W(x, y) | = Vy 0 o(x,y) | (4.13) 
Wy(x,y) Vy —Vy c(x,y) 
or, equivalently, 
Wi(x,y) |__| Vx Vx | (4.14) 
Wy (x,y) Vy 0 c(x,y) 











In this non-orthogonal frame, where the two basis vectors in the frequency domain are 


j2 7 Ky 0 [27 Ky. 
ej= 4 , a= ; or e2= J 5 (4.15) 
j21ky — j2I ky 0 


the two components ¢o(x,y) or @1 (x,y) and C(x, y) are given by the following: 


given by 


Lemma. Let c(x,y) be the curl of the vector field y(x,y), ie., 


c(x,y) = Vy W(x, y) — Vry (x,y) (4.16) 


and let (x,y), v(x, y) be the scalar and vector potentials as in (4.7). Note that the sign con- 
vention of our definition of curl in (4.16) is opposite of the standard mathematics definition; 


we do this to simplify some signs in subsequent results. 


Also define C(x, y) and ¥(x, y) in differential equation form such that 


c(x,y) = VxVyc(x,y), (4 17) 


v(x,y) = V,Vyv(x,y) 
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with the boundary condition w(x, y). The integral form of (4.17) is 


seg [ : [ o(Ay,An) dA dap + w(x,y). (4.18) 


Then the vector field w(x, y) can be expressed as in (4.13) or (4.14) with 


(x,y) + Vy" W(x, y) + wy(y), 
(x,y) — Vx? W(x, y) + wy(x) 


do(x,y) 
1 (x,y) 


(4.19) 


with w,(x) and w,(y) depending on boundary conditions. 


Proof. From the bottom equation in (4.9) and the definition of the curl c(x, y) in (4.16), we 
obtain 
(vi + v,") v(x, y) =c(x,y). (4.20) 


Substitution of v(x, y), c(x,y) with ¥(x,y), G(x,y) as in (4.17) yields 
(V.2 +p?) Py) =e(x,y) +049) (4.21) 


with w(x, y) such that 
V,Vyw(x,y) = 0. (4.22) 


As shown in the Appendix, Equation (4.22) implies that we can write w(x, y) in the form 
w(x,y) = w(x) — wy(y). (4.23) 
Substituting v(x, y) in (4.7) with V,V,yv(x,y), we obtain 
| WY) | = 
Wy(x,y) 


Combine (4.24) with (4.21) and we can rewrite it as 


oe 
W(x, y) 


x 


O(x,y) + (4.24) 











Vi Vy W(x, y) | 


V VyVx"(—¥(x,y)) 


y 


(9,3) + V7.3) + wy(x,y)) - 
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Call do(x,y) = O(x,y) + Vy v(x, y) +wy(y) and the lemma is proven. This relationship 
shows that @o(x,y) contains information from the scalar and vector potentials. The model 
in (4.14) follows immediately by adding the column vector [V,., Vy]/ (x,y) to both sides of 
(4.13) which yields 

$1 (x,y) = go(x,y) — e(x,y) (4.26) 


and proves the result. a 


The significance of this result is that, in certain cases, as addressed in the next section, the 


term @o(x,y) is the total phase and it can be computed from equation (4.13) as 


x 


o(x,y) (4.27) 


W(x, y) 
V 


Wy (x,y) 














y 


using standard techniques. From this result, a rotational field in y(x, y) can be made irrota- 
tional by combining it with its own curl. In this case, @o(x,y), 61(x,y) or any combination 
thereof, becomes a possible scalar potential function. In the example below, where phase 
wrapping causes the phase gradient to become rotational, it is shown that the scalar po- 
tential @o(x,y) coincides with the actual phase 0(x,y), which is sensed by the wrapped 


gradients. 


Based on the fact that any signal is equivalent to its own convolution with the impulse, as 


c(x, y) = c(x, y) ** 6(x)d(y), we can write 


V.C(x,y) =c(x,y) *u(y), (4.28) 


with u(-) the unit step function and the “star” operations indicating 2D convolution with 
u(x)u(y) and 1D convolutions with u(x) and u(y) respectively. For clarity, we note that the 
first line of (4.28) is equivalent to (4.18). 


The following example illustrates the results in the Lemma presented above. 
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Example. Consider the phase 
6(x,y) = phase(x+ jy), (4.29) 


with the phase wrapped to the interval |0,27). Simple differentiation yields the wrapped 
gradient of 0 and its Fourier transform, as computed in [55] 


= Bd = £8 
W(x, y) = pane = (kK, Ky) -_ TPG?’ 
ky (4.30) 
W(x, y) = ae & Wy(kK, Ky) = TE 


The actual unwrapped gradient of 0(x,y) has to take the discontinuity at y = 0, x > 0 into 


Ys leGays | a” : 4.3 
Mac ” ee laa acl poe 


From substituting (4.30) into (4.9), we can easily solve for the scalar potential @(K), vector 
potential V(K) and the curl C(k) from (4.16) as 


account as 


(x) =0, 
V a : 4.32 

a 20(K,? + Ky)’ (4.32) 
C(k) = —21 


Now we can verify that equation (4.13) holds. We can substitute from (4.17) to solve 


= JK 
V.Vy-0(x,y) = Vyv(x,y) = IFT { we} ; (4.33) 
From (4.30), the right-hand side of the above equation is V,,6(x,y), and therefore 


Vy-(x,y) = O(x,y) + wy(y) (4.34) 


with w,(y) accounting for boundary conditions. Substitution into (4.19) and using the fact 
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that the scalar potential @(x,y) is zero, we obtain 


(x,y) = O(x,y) +wy(y). (4.35) 


The quantity $o(x,y) is the total phase we want to reconstruct, on the right-hand side of 
(4.27). Substituting for ¢ on the left-hand side of (4.27), we obtain 








W(x, y) 0 _ 
or a X,Y). (4.36) 
| Wy (x,y) | | 2mu(x)5(y) | Vy bo(x,y) 
The left-hand side is VO(x,y) from (4.31), which implies 
(x,y) = do(x,y) +C (4.37) 


where C is a constant. Equation (4.37) is consistent with (4.35) and wy(y) =C. 


In the next section, where we address the sampled data implementation, we actually prove 
analytically that @9|n1,n2] and the phase sensed by the wrapped gradients differ by integer 
multiples of 27, thus yielding the same wrapped values. In other words the “hidden phase” 


is included in @o, and therefore no “slope discrepancy” is in the gradients of go. 


4.4 Fried Geometry 


In the sampled data case, we extend the concepts introduced in the previous sections by 


defining the gradient operators on the basis of the Fried geometry. 


To this extent, given the sampled phase @9[11 ,n2] = @o(m1 Ax, n2Ay) we define the gradients 


in the two directions as 


Wi [ni ,n2] = 5 (Gol + 1,22 + 1] + gol[mi,m2 + 1) — ; (do[m1 + 1,n2] + do[71,72]), 


NIN] Re 


(do[71,72 + 1] + do[71,72)) . 
(4.38) 


Yo[ni,n2] = = (Gola +1,n2 + 1] + bo[m1 + 1,72]) -5 
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Using the shift operators z;, z2, we can write (4.38) in a more compact form 


1 
wilmi,no] = 5 (21 +1) (2 — 1) bol, na], 
1 (4.39) 
wolmi,na] = 5 (21 — 1) 2+ 1) bolm, na] 
from which we define the discrete gradient operators as 
1 
Vi (21,22) = 5 (1 + 1) (22-1), 
1 (4.40) 
Va(z1,22) = 5 (1 — 1) (2 +1). 


Substituting z, = e/®! and z) = e/® into (4.40), we obtain the discrete frequency response 


Vi(@) = 2¢-J cos (S) sin (2) ; 


V2(@) = ye eae sin (F) COS (2) . 


It is easy to see that both V;(@) and V2(@) are zero when @ = [0,0] (“piston” mode) and 


of the operators as 








(4.41) 


@ = [7,7 (“‘waffle” mode). As a consequence, 
Vx{n] =0 = x[n] =Cy+C,(—1)""™ (4.42) 


for some constants Co, C; that depend on the boundary conditions. 


It is well known that Fried derivatives are good models for Shack-Hartmann sensors, which 
measure local phase gradients. However, Barchers demonstrated that the Fried geome- 
try performance degrades in high scintillation when compared to Hudgin geometry [51]. 
When branch points causing phase wrapping are present, it is imperative to properly embed 
the wrapping operation within the Fried gradients computations. The development of the 
wrapped Fried gradient presented here is sufficient for reconstructing the high turbulence 


wavefront properly. 


From Figure 4.1, the block diagram approach of the Fried geometry can be seen. In 4.1 
(a), the standard Fried gradients are shown in terms of transfer functions in z; and z2. The 
first blocks (zj — 1) and (zz — 1) provide for phase differences in the vertical and horizontal 
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Eb a abPhEEb vm 


do|n1, n2| do[n1, N2] 


(b) 


Figure 4.1. The block diagram form (a) the traditional Fried gradient and (b) the wrapped Fried 
gradient. 


directions, while the second blocks (zz + 1)/2 and (z; + 1)/2 provides simple averaging 


(half the sum) in the opposite directions (horizontal and vertical). 


When phase wrapping is present, the first blocks, which model phase difference measure- 
ments, must be augmented with the phase wrapping operation W. Wrapping guarantees 


that phase differences multiples of 27 are not sensed. 


Call VW the wrapped Fried derivative gradient in Figure 4.1 (b) and y{n], the sensed 
wrapped gradients, as 
VW, 


vw, o(n1, 2]. (4.43) 


ylni,n2| = 








From the definition of the wrapping operator W and the factor 1/2 in the averaging second 


block we can relate the gradient and wrapped gradient by 
VWoo|n] = Voo[n] + 7¢[n] (4.44) 


with ¢[n] having integer values only. 


The wrapped Fried gradients are the basis of the phase estimation presented in the next 


section. 
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4.5 Least Squares Phase Estimation from Wrapped Fried 


Gradients 


Along the same lines as in the previous section, the vector field y can be represented in 


terms of scalar and potential functions @ and v as 


| yi lm, n2| | = 
yo[n1, 72] 


Equation (4.45) is the same as for the continuous case in the previous section, and in the 


(4.45) 











(11, n2| | 


v[ny,n2| 


frequency domain this just becomes a matrix-vector operation 


Yi (@) 
P2(@) 


P(@) 
V(@) 


= e/ 2 


(4.46) 
S1C2. —C1S2 








-@) +7 | C182 S1C2 











with c; = cos(@;/2) and s; = sin(@;/2) for i= 1,2. We can easily verify that the two 


columns of the matrix represent two orthogonal vectors. 


The result of the previous section can then be extended to the sampled data case by defining 


the curl of the vector field as 
c[ny,n2| = Vow (m1, n2] — Viyo|ni, ng]. (4.47) 


with V;, V2 the standard unwrapped Fried derivatives. Then the vector field y can be 


expressed as 











V 0 
Wi [11,2] = 1 Gol na] (4.48) 
W2|n1,n2| V2 —V2 C[n, n2| 
with ¢ defined as 
c[ny,n2| = Vi V2¢e|n1,n2]. (4.49) 


In order to obtain an expression for @, first notice that any signal can be represented in 


convolution (double convolution in the 2D case) form 
c[ny no] = c[ny, na] ** 6[n1]6[n2| (4.50) 
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with 6[n| the discrete impulse, being 6[0] = 1 and 6|n] = 0 for all n 4 0. In equation (4.49), 


the product of the two operators can be expressed as 


Vi Vas) = i= Dia le= ier) 


=F (2-1) (@?-1). 


Ble Ble 


Now if we define the sequence 


uln] = (14+ (-1)”) uln — 2], 


plotted in Figure 4.2, we can verify that u[n + 2] —u|n| = 26|n], and therefore 


6[11]6[n2] = Vi (z)V2(z)ulnijalng| 
so that ¢ can be written as 
C[ny ,N2] = c[ny,No] ** U[n;]a[Nn2]. 


With these premises, we can state the main result of this research. 


(4.51) 


(4.52) 


(4.53) 


(4.54) 


Main Result. Let y|n,,n2| be the vector field of the wrapped Fried gradient of the phase 


o[11,N2|, defined as 








W [n] VW, 
$o[n| 
y2[n| VW2 
Also let its curl, c{n], be such that 
cin] = [nj 
i.e., it assumes values only integer multiples of 7. 
u[n| 
LLL 





Figure 4.2. The function a[n] is used to create C[n| from c[nj. 
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(4.55) 


(4.56) 


Let $[n] be such that 














n 0 Vi |. 
alae =| ("| én] (4.57) 
voln| | | cnj** gin] | | Ve 
with ** denoting 2D convolution and g{n] defined as 
1 es 
ql m2] = 5(z1 — 1)(e2 + Nal]. (4.58) 
Then there exist constants Cp, C; for which 
Go [n] = [nm] + Co + C) (—1)" 7 + 2700 [n] (4.59) 


with the rightmost term a sequence assuming only integer multiples of 27. 


Proof. By exactly the same arguments as in the previous section, equation (4.57) holds for 
some ${n]. What we need to show is that the estimated phase @[n] and the original phase 
$o(n] are the same apart from integer multiples of 27, a piston mode Cp and a “waffle” 
mode C;(—1)"*”2. 


The argument is based on the fact that the curl sequence c|n;,n2] assumes values which 
are all integer multiples of z. Also it is a simple exercise to verify that the sequence 


q|n| = 0, +2 for all n. As a consequence, for all n, 
c[n] ** g[n| = 27¢[n]. (4.60) 


where, again ¢/n] denotes a sequence of integer values. Recall that the relation between 
Fried and wrapped Fried gradients in (4.44) and the observed wrapped phase gradient y|n]. 
Then by substitution into (4.57) we obtain that the sequence 








(in) 4tmy =a | | (4.61) 
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1.e., it assumes values which are integer multiples of 7 for all n. Since for any sequence 


E(n] = [nj x* [2 ]d [ng] (4.62) 
and 
d[n1|6[n2] = Vi (2(-1)"" "uly — lu[nz - 1)) (4.63) 
= V> (2(—1)~!u[ny — LJu[no — 1), 
we have that 
[In| = 2V,¢|n] (4.64) 
(n] = 2V2€[n] 


In other words, a sequence of integers is the Fried derivative of a sequence of even integers. 


Finally combine equations (4.61) and (4.64) to obtain 
V ($o[n] — $[n] + 272[n]) = 0, (4.65) 


which proves the result. | 


Estimation of ¢o[n] based on sensed Fried gradients y(n] is then carried out by computing 
@(n] from equation (4.57), using either standard least squares or (as is shown in the next 
section) the multi-resolution algorithm presented by the authors in [61]. 


Then, from the result in equation (4.59), we obtain 
go [n] = W (6[n] + Co) (4.66) 


with Cp a constant determined by a reference value. The “waffle” term is usually neglected 


since the data is assumed not to contain this term. 


The algorithm for Fried geometry can be summarized as a procedural list: 


1. Collect the sensor measurements Wj [11,n2] and Wo[11,n2]. 
2. Compute the curl 
c[ny,n2] = Voy [n1,n2] — Vi Yo[m1, nz]. (4.67) 
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Irrotational 


Least Squares 


[n] = V¢o{n] or 


Wavelet 





Rotational (branch points) 





wv[n] = VWeo[n] Least Squares 


or 
Wavelet 


Figure 4.3. The block diagram compares the traditional least-squares approach to the new form 
that is capable of handling branch points. When the curl is equal to zero, the output is exactly 
the same for both forms. 


3. Compute the quantity 
V2e|n1,n2] = cln1,n2] ** g[N1,N2]. (4.68) 
4. Modify the measurement 
Wo. new[1,N2] = W2[M1,n2| + V2eln1,n2]. (4.69) 
5. Use Wi[11,n2] and W new[21,N2] in the standard least-squares reconstructor to solve 
| Wali, n2] | - 
Wo.new [N12] 


The comparison of this algorithm with the traditional approach is given in Figure 4.3. 


for b(n, nz] 
vi 


- (m,n). (4.70) 








4.6 Application to Phase Estimation 


The algorithm presented in Section 4.5 has been applied to a number of phase signals both 


geometric and simulated wavefront phase functions. 


In the following examples when noise is present, the Gaussian white noise is added to the 
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phase difference measurement quantities as 


| We noisy |, n2] | _ | VW. [m1, 2] | 4 | an,|n1,n2| | (4.71) 


Wy noisy [11 ’ n2| W [m1 ’ n2| any [m1 ’ n2| 


where & is chosen to ensure the desired SNR for simulation. The noise source models the 
uncertainty in the centroid operation of the S-H WFS. Unless stated otherwise, amplitude 


is not used in the reconstruction and no noise added to the amplitude. 


In addition, we provide a comparison with the proprietary SPhase algorithm in AOTools 
and WaveProp, developed by the Optical Sciences Company [62], [63]. SPhase uses ampli- 
tude and phase information for wavefront reconstruction and its goal is to place the branch 
cuts in areas of low irradiance for a continuous DM. SPhase also performs phase unwrap- 


ping. Thus, the goals of SPhase are different than the algorithm presented here. 


4.6.1 Example 1: Geometric Signal 
Let s =x+ jy and define p(n] as samples of the phase 


(x,y) = phase(s) (4.72) 


with sampling intervals 6, = 6, = 0.01, the number of data points N = 256 x 256 and a 
shift by 6,/2 and 6,/2 so that the singular point x = y = 0 is not in the sampling grid. 


In Figure 4.4, the 3D plot of the wrapped phase W@o|n] is shown and in Figure 4.5 the 
wrapped estimated phase Wé@[n] is shown. The reconstruction is an exact match. The 
significance of this is observable from Figures 4.6 and 4.7, where the large discontinuity 
is not apparent. The lack of discontinuity in the measurements is the importance of the 
wrapped Fried gradient model, otherwise the ridge would be in the gradient data that is the 


input to the algorithm. 


In this particular example, because the discontinuity is along the same dimension as our 
non-orthogonal correction, the result is exactly the same as the input. If the input had a 
discontinuity at a different angle relative to the origin (which would still result in the same 
wrapped measurements), the resulting output would still be the same as the one shown in 


Figure 4.5. 
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Figure 4.4. The original @[{n] phase data for example 1 is shown. 


4.6.2 Example 2: Geometric Signal 


Similarly, in Figure 4.8 the samples of the phase for the function 


(x,y) = phase ( bili =) : (4.73) 





with by = 0.5150 — j0.26, bz = 0.0050 + j0.26 and a = 0.005 + j0.005 are shown. Two 
estimates are shown: without noise in Figure 4.9 and with noise added to the observations 
(with 40dB SNR) in Figure 4.10. 


The comparison between Figures 4.8 and 4.9 along the top center shows two different 
boundaries of maximum (red) and minimum phase (blue). In this example, we show that Cp 
from (4.66) is set to a constant that causes a slightly different wrapping than Figure 4.8. The 


gradient measurements are the same and we show that the discontinuity can be positioned. 


With this example, we are able to know the amplitude and phase. If we run SPhase with 
the phase, but set the uniform amplitude to be unity, SPhase chooses a simple branch cut 
scheme of connecting the two closest branch points to one another, and the third (closer 
to the bottom) branch point straight to the bottom edge. Our algorithm connects the lower 
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Figure 4.5. The reconstructed Wé@[n] phase data for example 1 is shown. 


branch point to the right edge (due to the non-orthogonal basis). Wrapping the phase for 


either result has the same output. 


Using both the amplitude and phase information from (4.73), SPhase creates a slightly 
more complicated branch cut between the upper two branch points that takes advantage of 


the lower irradiance path between these singularities. 


4.6.3 Example 3: High Turbulence Phase Signal 

WaveProp was used to generate the algorithm input data for this example. We tried the 
algorithm under a variety of operation conditions, but only present the highest turbulence 
results here as other cases also were successful. WaveProp simulated a 1.0 meter diameter 
circular aperture in a 2048 x 2048 E-field grid. The simulation used A = 1m through a 
4 km horizontal path. The atmospheric effects were assumed to be a constant turbulence 
through 5 phase screens. The C? value is 7 x 10~'° with a calculated Rytov number of 
0.3051. 


The phase signal is shown in Figure 4.11, with estimates in Figure 4.12 (no noise) and 
Figure 4.13 (noise with 40dB SNR). For the noiseless case, the location of the detected 
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n, coordinate 
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Figure 4.6. The wrapped gradient y(n] data for example 1 is shown. The large discontinuity 
seen in Figure 4.4 is not apparent in the wrapped measured gradient. 


branch points are shown in Figure 4.15, while the branch points of the original phase as 


determined by WaveProp are shown in Figure 4.14. 


SPhase only works on this example when the correct (original) amplitude is also supplied to 
its input. Setting a constant amplitude results in a signal of little interest (even the wrapped 
output did not match the original data). The wrapped output of SPhase (using the WaveProp 
amplitude) is identical to the output of our algorithm. Thus, we can say that the amplitude 
information is important in the SPhase algorithm, whereas the amplitude is not used by our 


algorithm proposed here. 


4.6.4 Example 4: Double Spiral 
Our last example is the double spiral shear from [54]. Although this dataset, shown in 


Figure 4.16, is used to test unwrapping, we decided to include it here. Ghiglia states that 
this example has failed in unwrapping when there is noise on the measurements for all 
unwrapping algorithms covered by their book. The actual spiral data has one arm ascending 
(with, a positive n; gradient) and the other spiral arm descending with a negative n; gradient 
of the same magnitude. 
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Figure 4.7. The wrapped gradient y[n] data for example 1 is shown. The large discontinuity seen 
in Figure 4.4 is not apparent in the wrapped measured gradient. The correction term proposed 
in this dissertation will be added to yon] to create the large discontinuity. 


Our algorithm results in Figure 4.17 for no noise, and in Figure 4.18 for 40 dB SNR. In the 
case of noise, the noise can potentially cause the phase value to wrap and the horizontal 
bar pattern can form. However, in the no noise case, the reconstruction is exact. The 


determined branch point locations match Figure 3.10 in [54]. 


Since SPhase also includes unwrapping, it has difficulty on this data set. While its output 
does show the double spiral pattern, the spiral arms are flat areas. The boundary pixels 
between the spiral arms often do not fully resolve correctly and have discontinuities. The 
wrapped output of SPhase is not a good match to the original surface. One spiral arm takes 
on zero value for all pixels, and the other spiral arm has areas that are close to +7. The 


boundary pixels of the spirals in the wrapped output also have discontinuities. 


4.7 Summary 
In this research, we addressed the problem of estimating a phase signal based on observa- 
tion of wrapped local variations. This approach is based on a particular representation of 


the vector field in terms of a non-orthogonal basis which seems to be better suited than the 
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Figure 4.8. The original @[n] plotted for example 2 is shown. 


n, coordinate 





50 100 150 200 250 
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Figure 4.9. The estimated phase W@[n] plotted for example 2. 


standard orthogonal basis associated to scalar and potential field. 


It was shown that by correcting the observed gradient with a filtered curl, the overall phase 
(including what has been called the “hidden phase’) is estimated by standard least-squares 
solver. A number of computer simulations support what has been stated based on math- 
ematical analysis. A comparison with SPhase shows that our algorithm results in the 
same wrapped phase measurements, which is expected since the algorithms output dif- 
ferent phase functions of the ensemble of wavefront surfaces that have the same gradient 


measurements. The examples show that the wrapped @p is equal to the wrapped total phase. 


This approach is able to efficiently determine a wavefront surface that is a member of the 
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Figure 4.10. The estimated phase W@[n] with 40 dB SNR for example 2 is shown. Pixels with 
values close to —7 or % may wrap due to the noise. 
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Figure 4.11. A high turbulence wavefront phase [n] created using WaveProp for example 3. 


ensemble of wavefronts that all have the same gradient measurements. The approach is 
as computationally efficient as the least-squares or equivalent reconstructor chosen. The 
approach does not unwrap the phase, as we leave that as a follow on step to the output of 


our algorithm presented here. 
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Figure 4.12. The estimated wavefront W6[n] is reconstructed for example 3. 


n, coordinate 
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Figure 4.13. The estimated wavefront W@[n] is reconstructed with 40 dB SNR. 
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Figure 4.14. The known branch points for example 3 with no noise are shown. Locations were 
determined by WaveProp. Positive branch points are indicated with a red plus while negative 
branch points are indicated with a blue dot. 
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Figure 4.15. The detected branch points for example 3 with no noise are shown. Positive branch 
points are indicated with a red plus while negative branch points are indicated with a blue dot. 
Because the Fried geometry averages neighboring values, the locations on this plot are quadrupled 
compared to Figure 4.14. 


79 


n, coordinate 





50 100 150 200 250 
n coordinate 


Figure 4.16. The spiral dataset @[n] from [54]. This dataset is known to be difficult to process 
correctly. 
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Figure 4.17. The estimated phase Wé@[n] reconstructed for the spiral dataset. 
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Figure 4.18. The estimated phase W@[n] reconstructed for spiral dataset with 40 dB SNR. 
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CHAR TERS: 
Mirror Surface Control Using Optimization 





As mentioned in the introductory chapter, the goal of the AO system is to compensate for 
phase distortion. To achieve this goal, the phase distortion determined by the wavefront 
reconstruction, presented in the previous two chapters, has to be properly compensated. In 
this chapter we address the problem of setting the control actuators of a DM to compensate 


for the phase distortions detected by the wavefront reconstructor. 


The phase distortion detected by the phase reconstruction algorithms presented in the pre- 
vious two chapters, is at the basis of using mathematical optimization for mirror surface 
control that was conducted in the SMT laboratory. Although the original mirror control 
software determines actuator settings by using mathematical optimization routines, these 
routines were never validated on the hardware. The DM behavior has nonlinear character- 
istics; however, the original optimal control problem uses a linear model approximation of 
the actual hardware performance. We sought to determine whether control using the lin- 
ear model was valid, the range of operation where the linear assumption is true, and what 
the resulting performance levels were as measured in root-mean-square (rms) error of the 


wavefront surface. 


SMT is an active optics system with surface parallel actuators. Applying a voltage across 
the actuator causes the mirror surface to change. The goal of this technology is to al- 
low larger variances in manufacturing tolerances. Deviations from the intended optical 
prescription are removed by the actuators during operation. This design saves money by 


reducing costly manufacturing rework. 


The primary mirror of the SMT has six segments that are hexagonal-shaped mirrors, each 
having 156 controllable actuators. Although in the previous chapters a S-H WES was used 
to estimate the wavefront, the SMT primary mirror is measured using an interferometer 
sensor placed in front of the telescope as shown in Figure 5.1. The sensor choice allows for 
high resolution sampling of the mirror surface that would not be available with the labora- 
tory S-H WFS. A Stewart platform is used to position the interferometer along the optical 


axis of the primary mirror. The interferometer and null corrector are mounted to remove the 
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Figure 5.1. The hardware arrangement in the laboratory that was used for influence functions 
and measuring surface curvature. Image is courtesy of John Bagnasco. 


spherical aberration at the primary mirror center of curvature. Each interferometer sample 


measures the mirror surface height in units of wavelength. 


In the next section, we describe the original optimization problem developed by the SMT 
manufacturers to control the mirror surface. In Section 5.2, we explain a hardware test 
algorithm which solves a series of optimization problems and each result can be used to 
control the mirror surface. In Section 5.3, we modify the algorithm to use a multigrid 
approach for the optimization problems and reduce computation time. We summarize the 


chapter in Section 5.5. 


5.1 Original Optimization Problem 
The original SMT control software solves a constrained optimization problem for each 
primary mirror segment to determine the actuator voltages. The constrained optimization 


problem is 


minimize J=4\|Cx—all3 
x (5.1) 
subjectto Ax <b 


where the vector x contains the actuator voltages that minimize the cost function J, C is the 
linear influence function matrix, and the d vector contains the desired mirror shape. The 
matrix A and vector b model the hardware voltage limits. 
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The cost function is minimized when the actuators change the mirror surface to match the 
desired mirror surface shape. When the mirror forms the conjugate shape of the wavefront 
distortion as measured by the interferometer, the wavefront rms error decreases. The nota- 
tion || - ||2 in the cost function identifies the Lz norm [21], or the Euclidean magnitude of a 


vector. 


The vector x consists of the M = 156 actuator voltage biases from the nominal operating 
voltages. After solving the optimization problem, the software commands the actuator 


voltages. 


The linear influence matrix C has as many rows as the number of measured samples, P, 
and as many columns as the number of actuators, M. The influence function for actuator m 


is defined as 


OPDn(V) — OPDn(V>) 
Vi — V2 





fn[m1,n2| = (3.2) 
where OPD,,(V) is a discrete 2D optical path difference (OPD) for the applied voltage V 
to actuator m. Each OPD is calculated from multiple interferograms collected by an in- 
terferometer. Forming each influence function requires 2 measurements, one with positive 
actuator voltage bias V; and the other with a negative voltage bias V2. Both V; and V2 are 
chosen to have the same distance from the nominal voltage and represent the range over 
which the actuators are assumed to have linear operating characteristics. Further details on 
measuring influence functions are found in [6] and simulated influence functions can be 


created by using integrated optomechanical analysis [64]. 


Although the interferometer collects approximately 1,000 x 1,000 samples per OPD, a 
segment only covers a fraction of the area. Each segment has a mask that identifies which 
samples display the segment surface. As a result of the optical configuration, a particular 
segment has P ~ 60,000 samples that overlap the measurement. Each masked fi,[71, 12] is 


formed into a column vector 
Cm = vector(masked( fi) ) (5.3) 


where masked(-) keeps the samples of the measured mirror surface and discards all others. 
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Figure 5.2. The C matrix processing is shown for a single simulated influence function. 


After all of the influence functions are collected the C matrix is defined as 


C= Cj C2 ++: Cy]. (5.4) 


The process of creating C is shown in Figure 5.2 using simulated data with approximately 


8,000 samples on the surface. 


The vector d is the desired wavefront as measured by the interferometer and is of the same 
dimensions as a single column of C. Immediately before running the optimization, these 
data are collected. The optical system has alignment issues that cause large piston, tip and 
tilt aberration. These modes are removed from the data, as they are not of interest to correct 
in this study. 
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Table 5.1. The dimensions of the quantities used in the optimization problem where M = 156 
and P = 60,000. 


Variable Dimensions 





A 2MxM 
b 2Mx1 
C PxM 
d Pxi 
x Mx] 
lower Mx 1 
upper Mx 1 


Figure 5.3. The optimization problem combines the SMT influence functions to create the desired 
mirror surface shape by treating each x; as a scaling factor for the influence function. 


The constraints Ax < b can be used to form a convex hull [65] using A and b of the form 


I 
. wa PF). (5.5) 
—I lower 


The set of x values that satisfy the constraints is called the feasible set and the optimization 


A= 





solution must be contained in this set. The dimension of A is 2M x M and dimensionality 
of b is 2M x 1, resulting in twice as many constraints as actuators. We use lower and upper 
to signify bounds on each actuator. The implementation used by the SMT developers was a 
constant vector that constrained each actuator to the same range used to calculate influence 
functions in (5.2). 


Table 5.1 summarizes the dimensions of the optimization problem. Since M < P, the 
optimization problem is a vastly overdetermined linear system of equations. As shown in 
Figure 5.3, each actuator is commanded to achieve the desired surface using the obtained 


optimal solution x*. 


To solve the optimization problem of (5.1), the control software uses the LSQLIN optimizer 
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function in the MATLAB Optimization toolkit. The algorithm details are documented 
in [66], [67]; however, one significant limitation of the optimizer function is that it only 
returns the final solution and the sub-optimal solutions of the trajectory are not available by 
design. For the general use-case scenario of LSQLIN, intermediate sub-optimal solutions 
are of little interest. Cohan and Miller state that control using an optimization problem so- 
lution of a modeled segmented mirror matches to within 7% of NASTRAN finite element 
model prediction [68]; however, we sought to test this on hardware and to determine if the 
hardware response is similar to the linear influence function approximation. To do so, we 


need more than just the final optimization solution. 


The linear influence function model assumes that the actuators operate independently and 
their combination follows the principle of superposition. In the actual hardware, we expect 
that the system is not truly linear and that actuators have nonlinear coupling. With only a 
single optimization solution to compare against the hardware, we cannot collect significant 
data to make a determination as to whether the linear system model accurately represents 
the hardware characteristics. 


In order to produce more than one solution to test, we must create a trajectory, or sequence 
of solutions. In the next section, we present our developed technique to create a trajectory 
of actuator voltages to command the DM. We can determine the linear region from the 


trajectory. 


5.2. Trajectory Creation Algorithm 

We developed an algorithm to generate a trajectory from the solutions of a sequence of 
optimization problems with varying constraints. The solutions can be used to compare the 
linear influence function model against the hardware performance. We wanted to generate 
a series of solutions with a “small” step from the previous iteration along the trajectory 
to the final result x*. We define “small” steps as a combination of two constraints: a 
norm constraint and a moving boundary constraint. First, the LZ. norm of the actuator 
change in value is forced to be less than or equal to a chosen constant. Second, each 
actuator movement is individually constrained from the previous iteration value. Since 
each individual actuator has a boundary constraint, placing a constraint on the LZ. norm 


limits the number of actuators that are on the boundary constraint. This combination of 
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constraints allows all actuators to change value, but also have a new solution that is “local” 


to the previous iteration. The algorithm form to create a trajectory is 


iterate 
j 


minimize J= ||Cx;—dl|, 
j 
subject to lower < x; < upper (5.6) 
IIx -xj-1|], < @ 
—B <xj-xj1<8 
stop condition IIx; —xj-1||, <€é 


where the first constraint enforces the hardware voltage limits and we add two new con- 
straints. We use the threshold values a and B to limit the change of values between iter- 
ations and € to determine when the trajectory has “settled” and the algorithm is finished. 
We initialize the problem with j = 1 and xo = zeros(M, 1) and the iterations cease when 
the stop condition has been satisfied. 


The form of the new constraints only affect the change per iteration and do not affect the 
final optimal solution x* for our original problem; the final iteration solution is equivalent 


to the solution of the optimization problem (5.1) to within numerical rounding. 


To give an idea of how the combination of constraints work, an example trajectory is shown 
as a projection into the 2D space of actuators x; and x2 in Figure 5.4. For this example, each 
vector x; contains variables x; to x, (k > 2). The norm constraint prevents all of the variables 
from moving beyond a “small ball” per iteration. The radius of the norm constraint when 
projected in 2D space of x; and x2 is a function of the orthogonal variables x3 to xz. Each 
moving boundary constraint ensures that every actuator does not change by more than a 
threshold value which forms a box shape. For solutions x; and x2, the boundary constraint 
was the active constraint that restricted the solution, whereas solution x3 was restricted by 


the norm constraint. 


The norm constraint is not compatible with the LSQLIN solver, which is designed for equal- 
ity and inequality constraints. For this reason, we use SeDuMi [69] as our solver for the 
optimization problem. Although we could use the SeDuMi routines directly, we instead 
choose to use YALMIP [70] to construct the optimization problem. YALMIP translates 
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Figure 5.4. An example trajectory of xo to x3 is shown as a 2D projection. The blue circles 
indicate the @ norm constraint and its radius is a function of variables x3 to x, (which are not 
shown). The red squares show the convex B bounding box constraint. 


optimization problems into forms understood by solvers and can simplify implementation. 


For each run of the optimization problem, SeDuMi returns a new solution that has a lower 
cost function result. We then continue to iterate until the norm difference between x; and 
x;-1 18 less than threshold €, which indicates that SeDuMi has settled near the optimal 
solution x*. With the combination of constraints, the correction begins with low spatial 
frequencies. As the iterations progress, finer detail is possible in the result. 


A drawback to this approach is that the computation time for SeDuMi is much longer 
than LSQLIN since many optimization problems are being solved. The computation time 
was not an important factor to consider as this control software was not intended for real- 
time feedback. However, as a means to decrease the processing time, we implemented the 


optimization problem on multiple grids. 


5.3. Multigrid Optimization Problem 
The system of equations is overdetermined since P >> M by an factor of 400, and the 


optimization algorithm must perform many linear algebra computations. 


Adjacent OPD measurements are similar in value, which led to the idea of using the low- 
pass filters of the Discrete Wavelet Transform to consolidate measurements. Since we are 
using only the low-pass filters, this is considered a multigrid approach [71]. In Chapter 


2, we discussed multiresolution analysis of signals. The significant difference between 
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multiresolution and multigrid is that multigrid methods discard the high-frequency content 


whereas multiresolution methods do not [72]. 


In Section 5.1, we described the construction of the C matrix and d vector. For our multi- 
grid technique, we create C at the multiple grids. We can use operator notation to express 


the resizing of influence functions in (5.2) as 
finlimy 9] = (DDog(z1)g(Z2))'™™" finlsM2] (5.7) 


where we use the same operators and g(z) definition from Chapter 2, i is the grid number, 
and imax is the number of grids. The mask must also be redefined for each grid, which we 
can do by downsampling without filtering. At each downsampled resolution, the number 
of masked samples decreases by a factor of 4 and the masked(-) function is redefined. Each 


column of C is still of the form 
c!,, = vector(masked(f/,)) (5.8) 
so that the C matrix at grid i becomes 


Cree. ee (5.9) 
| | || 


The trajectory creation algorithm begins at the coarsest grid. At this grid, the optimizer is 
iteratively run until the stop condition is satisfied. The solution is used as the initial guess 


on the next grid. This process continues until the optimizer is run at the highest resolution 
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and its stop condition is satisfied. We write this in algorithm form as 


iterate 
1 


iterate 
j 


minimize J = ||Cix; — dil], 
Xj 
: (5.10) 
subject to lower < x; < upper 
Iki -xj-1|], So 
—B <xj-xj-1< 8 
stop condition IIx; —xj-1||, <€E 


where d; is the matching data for grid i. 


5.4 Experimental Results 

To perform a hardware test, a modification was made to the optical arrangement to add 
another DM into the beam path of the interferometer, which is shown in Figure 5.5. This 
DM has 140 actuators that act locally on the mirror surface. This additional DM was added 
because the primary mirror actuators do not have sufficient actuation range for the magni- 
tude of wavefront error. Rather than command the primary mirror actuators, its actuators 
are set to manufacturer voltages and we use the optimization problem to determine actuator 
voltages for the small DM only. The optical configuration has M = 55 actuators near the 
primary mirror segment under study; all other actuators were kept at their nominal flat- 
mirror position. The allowable range of actuator values for the control software to the DM 


is 0 <x < 100, which uses percentages and not voltages. 


For the optimization problem, we set « = 10, B = 1, and € = 10~*. These were chosen 


arbitrarily and give a sufficient trajectory with the desired “small” actuator movement. 


For the multigrid algorithm, the approximate number of masked samples for each grid is 
given in Table 5.2. We stopped at the fifth iteration since the next level would result in 
fewer equations than unknowns (P < M). 


We show the process of (5.10) visually in Figure 5.6. The DM multigrid trajectory first 


solves optimization problems on the coarsest grid. Once a solution has been determined 
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Figure 5.5. Left: The DM table is shown with an overhead view of the optics platform. The laser 
follows the blue arrows. Right: A view of the optics table that labels the components. Images 
are courtesy of John Bagnasco. 


Table 5.2. The number of samples used for each grid of the multiple resolutions. 





Grid level Number of samples 





1 P = 60,000 
2 P = 14,000 
3 P = 3,500 
4 P x 800 
5 Px 190 





for a grid, the result is used as the initial condition on the next higher grid. 


The multigrid trajectory algorithm was run on the same data as the single grid trajectory. 
A comparison of the cost function as a function of time is shown in Figure 5.7 for each 
approach. Although both approaches have the same final optimal solution, the multigrid 
algorithm finished approximately 2.5 times faster. The rate of decrease of the cost function 
is thirty times steeper for the initial two multigrid solutions compared to the initial two 


single grid solutions (see the tick marks in Figure 5.7). 


We expect that for each iteration, the cost function will decrease. The number of iterations 
at each level i is shown in Table 5.3. In Figure 5.7, we see that while the single grid cost 
function monotonically decreases, this is not the case for the multigrid. The cost function 


increase is understood by examining Figure 5.8. The cost function is evaluated at each res- 
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Figure 5.6. The DM multigrid trajectory first solves optimization problems on the coarsest grid 
and works towards the highest resolution. 


olution using C; and d;. We see that each segment of the plot is monotonically decreasing. 
When switching to a higher resolution, the cost function has an increase. Thus at each grid 
level, the optimizer is doing the best available solution for the influence functions at that 
grid and cannot correct for higher resolution distortion. The rate of cost function decrease 
at the coarse grid is significant compared to using the full grid, which implies that the mir- 
ror is effectively controlled at the coarse grid. There is not a substantial improvement to 
control the mirror at the full resolution. With much higher actuator density (M >> 55), we 


would expect that the mirror surface would improve at higher resolution control. 


Another feature to note in Figure 5.8 is the distances between the tick marks which indicate 
an iteration of the optimization problem. At the high resolution, the tick marks are broadly 
spaced because of the time required to perform the large-scale linear algebra. The multigrid 
approach solves each iteration much faster, though it has to perform more iterations to 


arrive at the same solution. 
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—A— Multi grid (evaluated with large C matrix) 
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Figure 5.7. The cost function evaluated at the highest resolution and plotted against the pro- 
cessing time of the trajectory creation algorithm for single grid and a 5-level multigrid. 


Table 5.3. The number of iterations for multigrid and single grid algorithms. 





Grid level Number of iterations 





Multigrid 


1 
2 
3 13 
4 
5 





Single grid 22 





The hardware response shown in Figure 5.9 is compared against the linear model prediction 
using the multigrid solutions. Up until approximately 5 optimizer iterations, the agreement 
between the linear model and hardware response is very close. After about 10 optimizer 
iterations, the model does not accurately predict the hardware response, which is due to the 
expected nonlinear behavior of the mirror [73], [74]. Despite this, the mirror surface does 
not see a significant change in the wavefront rms error as it levels off. However, while we 


can see that the hardware response has small oscillations in the wavefront rms error, the 
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time (s) 


Figure 5.8. The cost function evaluated at the same resolution as the iterative solution and 
plotted against time. 


overall wavefront rms error is fairly constant. The changes in the wavefront rms error can 
be attributed to nonlinearities in the mirror surface and measurement noise in the interfer- 
ometer. The optical alignment is moving due to mechanical vibration. Each interferometer 
measurement takes a duration of time and the motion has an effect on the result. In some 
cases, small areas of the mirror surface experience a different phase unwrapping result from 
the interferometer software. All of these effects attribute to the oscillations in wavefront 


error rms. 


In Figure 5.10, we show the mirror surface results from the final iteration of each grid. In 
each case, the rms wavefront error is consistent despite the changing the actuator settings. 
Although the DM has nonlinearities, the performance did not degrade due to them. 


For our optical configuration, we verified the performance of the LSQLIN solution by gen- 
erating a trajectory of solutions to test because we did not see the wavefront error increase 
as the iterations progressed. The linear range is shown to be about +10 percentage points 
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Figure 5.9. The hardware measurement is compared against the multigrid solution. 


of the actuators nominal setting. Past this range, the linear system model and hardware 
response differ due to the nonlinearities. The difference did not have a significant impact 
on the wavefront rms error. We also showed that the rms error was not influenced by using 
the full resolution interferometer image. For our system, the same level of wavefront error 


performance was possible from using a coarse grid of measurements. 


5.5 Summary 

In this chapter, we sought to experimentally test whether an optimization problem effec- 
tively controlled a DM. We sought to understand the hardware response of using optimiza- 
tion for mirror surface control and developed a trajectory creation algorithm to generate 
solutions to test on the hardware. To improve the computational efficiency, we used a 
multigrid approach and the optimizer solves the trajectory algorithm on several grids. The 
multigrid approach resulted in the same optimal solution at reduced computational cost 
when compared to the original trajectory creation algorithm. We verified the performance 


of a linear influence function model and determined the valid linear range of the DM. We 
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Figure 5.10. The hardware measurement of the segment at the final solution is shown for a) the 
coarsest grid through e) the finest grid. 


would expect further reductions in the wavefront rms error by using a DM with higher 


actuator density. 
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CHAPTER 6: 


Conclusion 





6.1 Main Contributions 


This dissertation presented three contributions to AO. The first is a wavelet approach to 
wavefront reconstruction. The algorithm uses local gradient measurements from a Shack- 
Hartmann wavefront sensor to estimate the wavefront phase. The 2D QMF tree structure 
and Noble identities are used to decompose the wavefront into the spatial frequency com- 
ponents. The inverse discrete wavelet transform is performed using these components to 
estimate the phase. Our method allows for the use of orthogonal wavelet filters and using 
longer filter lengths has improved noise-rejection performance for the estimation compared 
to the Haar wavelet used by Hampton et al. [43], [44]. He later proposed a Poisson solver 
to improve the noise performance [45]. For the high SNR data case, a modification was 


shown to make the resulting wavefront estimate not depend on the boundary condition. 


This algorithm has been designed for irrotational vector fields, where there is no phase 
ambiguity and the phase is well defined at every point. This is, in general, the case of 
distortion generated by low atmospheric turbulence. Under more severe turbulence condi- 
tions, the intensity of the optical field might be zero at isolated points, thus causing phase 
uncertainties. In this case, the measured phase gradient becomes rotational and it is char- 
acterized by phase uncertainty, and branch points, which cause problems in all standard 


least-squares algorithms. 


The second contribution adapts the proposed algorithm to work when branch points are 
present from significant atmospheric turbulence. An analysis of vector spaces shows that 
the branch points cause the rotational components in the measured gradients. An approach 
using a non-orthogonal decomposition to modify the rotational vector field to be irrota- 
tional is presented. The wavefront reconstruction algorithm operates on the irrotational 
measurements and estimates the phase that is consistent with the original measurements 
with rotational components. Our results show the wrapped phase to make a comparison 
between the simulated phase and reconstructed phase. This approach can be applied to any 


wavefront reconstruction algorithm as the measurements are modified before the algorithm. 
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A third contribution is made in segmented mirror with active optics. The control signals 
to a DM are computed that minimize the wavefront error using constrained optimization 
to ensure that the hardware actuator voltage limits are satisfied. The optimization problem 
uses a linear influence function model for the actuators to determine the voltages that form 
the desired mirror surface. The best results in terms of efficiency and convergence are 


obtained with a multigrid approximation of the influence functions. 


An experiment verified the performance of the mirror surface control using constrained 
optimization. A trajectory creation algorithm solved a sequence of optimization problems 
which forms a sequence of actuator values. Each solution has small changes from the previ- 
ous voltages and results in the same final optimal solution. The trajectory provides insight 
into the validity of the linear influence function model when compared to the hardware re- 
sponse. An OPD is collected for each iteration and used to evaluate the residual wavefront 
error. The multigrid approach is shown to have a rapid decrease in the cost function when 
compared to the single grid solution and results in the same final solution as the single grid 
2.5 times faster. Using a large amount of interferometer measurements did not significantly 
change the residual wavefront error, as the residual wavefront error for the optimal solution 
at the coarsest grid was comparable to the residual wavefront error for the fine grid. For the 
linear influence function model, the number of OPD measurements only needs to be on the 
order of the number of DM actuators. Analysis determined the range of linear operation 
for the DM, which for the particular DM used in the experiment, is small over the entire 
range of operation. Mirror surface control using optimization of a linear influence model 
has similar performance to other tested linear control methods [75] to decrease the wave- 
front error of a segmented mirror. For large corrections, a nonlinear controller can obtain 
better performance for the DM [76]. 


6.2 Future Work 


The wavelet phase reconstruction algorithm was applied to a Cartesian lattice. Other lat- 
tices, such as the hexagonal lattice, are also used in AO. Sensors use hexagon shaped 
lenslets which can be packed tightly together and use all of the collected light for measure- 
ments. The wavefront reconstruction algorithm can be extended to work on measurement 


data taken in this lattice. 


98 


Any orthogonal wavelets can be used for the phase reconstruction algorithm. One may 
develop specialized filter functions for the purpose of wavefront reconstruction that uses the 
statistics of the atmosphere to determine filter coefficients. Choosing wavelet coefficients 


may also be done in an adaptive manner to improve performance. 


In the study of the algorithm performance with noisy measurements, only Gaussian noise 
was used. This noise model is appropriate for the the case of AO sensor centroid measure- 
ments. For using this algorithm on RADAR data, a noise analysis should be conducted 


with other noise models. 


The wavefront reconstruction algorithm and branch point modification were studied exten- 
sively in computer simulation. The phase reconstruction can be studied in the laboratory to 
understand the estimation error for different wavelets given simulated atmospheric condi- 
tions. The branch point modification can be studied experimentally to verify performance 


for laboratory or atmospheric branch points. 


The wavefront reconstruction algorithm operates on a set of measurements independently 
from previous estimations of the wavefront. Successive estimations of the phase can be 
smoothed or blended to analyze AO performance and understand how the wavefront evolves 
temporally. The branch cuts can significantly change paths for each estimation due to noise 
but this is undesirable for hardware performance. Future work may use previous estima- 


tions try to smooth the evolution of branch cut placement. 


Following wavefront reconstruction, phase unwrapping may be necessary if the estimated 
phase contains large discontinuities. Some wavefront reconstruction algorithms (such as 
the complex exponential reconstructor [58], [59]) also perform phase unwrapping. Addi- 
tional work may be done to analyze which phase unwrapping algorithm works best with 


the wavelet wavefront reconstruction algorithm. 


For the segmented mirror with active optics contribution, the analysis of the mirror surface 
optimization can be extended by using a DM with higher actuator density. Our conclusions 
showed a small linear range of operation over which the linear approximation model can 
be used. Work can be done to extend the cost function to the nonlinear influence case. 


One possible approach is to redefine the influence function matrix to vary as a function of 
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actuator settings as C(x). After each iteration of the optimization problem, new influence 
functions can be generated for actuators to create a new linear influence matrix. Each 
optimization problem is treated linearly and the same approach can be applied with the 


new C matrix on the next iteration of the optimization problem. 


We converted the optimization problem to modal basis by transforming C and d, which 
changes the number of rows for each. For the Zernike polynomial basis, the result was not 
satisfactory due to having fewer equations (number of polynomials used) than number of 
actuators. Most Zernike decompositions only use ~ 50 polynomials, so for high actuator 
density, this basis is undesirable as there are more unknowns than equations. We also 
explored other techniques to decrease the number of rows of C and d, such as compressed 
sensing. Random sets of samples (at least as many as number of actuators) were kept for 
each OPD. The estimated wavefront surface was usually a good match for compressed 
sensing but would miss features of the wavefront that occurred away from the random 
samples. The best performance of a reduced system of equations we observed was from 
the multigrid approach, but any technique that can exploit the vastly oversampled wavefront 


may provide further reduction in wavefront error. 


One nonlinear control approach from the manufacturer is a lookup table to determine the 
actuator settings for a desired mirror surface. The use of a lookup table in a mirror surface 


control optimization problem has not been studied. 


In the iterative optimization problem, only one value for € was used at every grid level. 
Changing this value to be larger for coarser grids may reduce the number of iterations and 


result in faster convergence to the optimal solution. 


The cost function can be adjusted for multiple deformable mirrors to be installed in the 
AO system. The optimization problem can then choose the actuator commands for each 
mirror. Adjusting the constraints can result in a variety of techniques to split the necessary 


correction among the multiple mirrors. 
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APPENDIX: Proofs 





High-Order Wavelet Simplification Proof 


This proof shows how the results of (3.14) are determined. We start with the definition 








a= Sal 
g(-z") = we (A.1) 
and the definition of the geometric series 
N-1 —N 
1— 
y's, =e (A.2) 
(=0 - 


We immediately observe that Eqs. (A.1) and (A.2) can be combined: 


N-1 
ws yz’) (-z) 3 
g(-2¥) = ae (S = oe) (—z). (A.3) 





We have now shown the first result. The second result takes some manipulation similar to 
the concept of polyphase decomposition where we split the sequence up into an even and 
odd component. We proceed from the result of (A.3) in 


= ( pe) V2g(z)g(—-z). (A.4) 


The first simplification is the realization that the sum of the two sequences can be factored 
to 1+z7!. The final factoring swaps with the Haar scaling function and needs a V2 to 
cancel the denominator. 
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Branch Point Boundary Condition Proof 
Let w(x, y) be such that 
VVyw(x,y) =0 


for all x,y real. Then w(x, y) can be written as 
w(x, y) = a(x) + B(y). 


To show this, define 
g(x,y) = Viw(x,y) 
Then V,g(x, y) = 0 and therefore, 


g(x,y) = a(x,0), 


1.e., independent of y. Substitute in (A.7) to obtain 


w(x,y) = w(0,y) + [ “g(A,0)d, 


which shows the result. 
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(A.5) 


(A.6) 


(A.7) 


(A.8) 
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